Windows.Carving.Qakbot

This artifact enables Qakbot payload detection and configuration extraction from a byte stream, process or file on disk. The artifact runs a yara scan as a detection step, then attempts to process the payload to extract configuration.

QakBot or QBot, is a modular malware first observed in 2007 that has been historically known as a banking Trojan. Qbot is used to steal credentials, financial, or other data, and in recent years, regularly a loader for other malware leading to hands on keyboard ransomware.

Qakbot (Qbot) payloads have an RC4 encoded configuration, located inside two PE resources. Encoded strings and xor key can also be found inside the .data section starting at a specific offset.

Some of the options available cover changes observed in the last year in the decryption process to allow simplified decoding workflow:

  • StringOffset - the offset of the string xor key and encoded strings.
  • PE resource type - the configuration is typically inside 2 resources.
  • Unescaped key string - this field is typically reused over samples
  • Type of encoding: single or double, double being the more recent.
  • Worker threads for bulk analysis / research use cases.

The decryption used is fairly simple with the first pass RC4 found in encoded strings embedded in the malware and is often reused from previous samples.

Each decoded output includes the first 20 bytes in hex as the SHA1 of the data as verification. The second pass RC4 key is the next 20 bytes in hex, Second pass RC4 decoding has the same verification of decrypted data.

NOTE: Requires 0.6.8 for PE dump


name: Windows.Carving.Qakbot
author: Matt Green - @mgreen27
description: |
    This artifact enables Qakbot payload detection and configuration extraction 
    from a byte stream, process or file on disk. The artifact runs a yara scan 
    as a detection step, then attempts to process the payload to extract 
    configuration.
    
    QakBot or QBot, is a modular malware first observed in 2007 that has been 
    historically known as a banking Trojan. Qbot is used to steal credentials, 
    financial, or other data, and in recent years, regularly a loader for other 
    malware leading to hands on keyboard ransomware. 
    
    Qakbot (Qbot) payloads have an RC4 encoded configuration, located inside two 
    PE resources. Encoded strings and xor key can also be found inside the .data 
    section starting at a specific offset. 
    
    Some of the options available cover changes observed in the last year in the 
    decryption process to allow simplified decoding workflow:
    
    - StringOffset - the offset of the string xor key and encoded strings.
    - PE resource type - the configuration is typically inside 2 resources.
    - Unescaped key string - this field is typically reused over samples
    - Type of encoding: single or double, double being the more recent.
    - Worker threads for bulk analysis / research use cases.

    The decryption used is fairly simple with the first pass RC4 found in 
    encoded strings embedded in the malware and is often reused from previous 
    samples. 
    
    Each decoded output includes the first 20 bytes in hex as the SHA1 of the 
    data as verification. The second pass RC4 key is the next 20 bytes in hex, 
    Second pass RC4 decoding has the same verification of decrypted data.
    
    NOTE: Requires 0.6.8 for PE dump

reference:
  - https://malpedia.caad.fkie.fraunhofer.de/details/win.qakbot
  - https://docs.velociraptor.app/blog/2023/2023-04-05-qakbot/
type: CLIENT


parameters:
  - name: TargetBytes
    description: Parameter to enabling piping a byte stream of a payload dll
    default:
    type: hidden
  - name: TargetGlob
    description: Glob to target payloads on disk
    default: 
  - name: PidRegex
    description: Regex of target Process ID to scan
    default: .
    type: regex
  - name: ProcessRegex
    description: Regex of target Process Name to scan
    type: regex
  - name: StringOffset
    description: Offset of beginning of encoded strings in .data section. 
    default: 0x50
    type: int
  - name: ResourceRegex
    description: Regex to select targeted PE resource name.
    default: 'BITMAP|RCDATA'
  - name: Keys
    description: Lookup table of recent Keys. Add additional keys to extend capability.
    type: csv
    default: |
        Type,Key
        double,Muhcu#YgcdXubYBu2@2ub4fbUhuiNhyVtcd
        double,bUdiuy81gYguty@4frdRdpfko(eKmudeuMncueaN
        single,\System32\WindowsPowerShel1\v1.0\powershel1.exe
        single,\System32\WindowsPowerShell\v1.0\powershell.exe
  - name: Workers
    description: Number of workers to run. For bulk usecase increase to improve performance.
    default: 1
    type: int
  - name: YaraRule
    description: Yara rule to detect Qakbot payload.
    type: hidden
    default: |
        rule win_qakbot {
            meta:
                author = "Felix Bilstein - yara-signator at cocacoding dot com"
                date = "2023-01-25"
                description = "Detects win.qakbot."
            strings:
                $sequence_0 = { 50 e8???????? 8b06 47 }
                $sequence_1 = { e9???????? 33c0 7402 ebfa }
                $sequence_2 = { 740d 8d45fc 6a00 50 }
                $sequence_3 = { 8b06 47 59 59 }
                $sequence_4 = { eb13 e8???????? 33c9 85c0 0f9fc1 41 }
                $sequence_5 = { 7402 ebfa 33c0 7402 }
                $sequence_6 = { 0fb64903 c1e008 0bc2 c1e008 0bc1 c3 55 }
                $sequence_7 = { ebfa eb06 33c0 7402 }
                $sequence_8 = { 8d45fc 6aff 50 e8???????? 59 59 }
                $sequence_9 = { 59 59 6afb e9???????? }
                $sequence_10 = { 48 50 8d8534f6ffff 6a00 50 }
                $sequence_11 = { 5e c9 c3 55 8bec 81ecc4090000 }
                $sequence_12 = { e8???????? 83c410 33c0 7402 }
                $sequence_13 = { 7cef eb10 c644301c00 ff465c 8b465c 83f838 }
                $sequence_14 = { eb0b c644301c00 ff465c 8b465c 83f840 7cf0 }
                $sequence_15 = { c644061c00 ff465c 837e5c38 7cef eb10 c644301c00 }
                $sequence_16 = { 7507 c7466401000000 83f840 7507 }
                $sequence_17 = { 85c0 750a 33c0 7402 }
                $sequence_18 = { 72b6 33c0 5f 5e 5b c9 c3 }
                $sequence_19 = { 7402 ebfa e9???????? 6a00 }
                $sequence_20 = { c7466001000000 33c0 40 5e }
                $sequence_21 = { 6afe 8d45f4 50 e8???????? }
                $sequence_22 = { 7402 ebfa eb0d 33c0 }
                $sequence_23 = { 50 ff5508 8bf0 59 }
                $sequence_24 = { 57 ff15???????? 33c0 85f6 0f94c0 }
                $sequence_25 = { ff15???????? 85c0 750c 57 ff15???????? 6afe }
                $sequence_26 = { c3 33c9 3d80000000 0f94c1 }
                $sequence_27 = { 6a02 ff15???????? 8bf8 83c8ff }
                $sequence_28 = { 6a00 58 0f95c0 40 50 }
                $sequence_29 = { e8???????? 33c0 c3 55 8bec 51 51 }
                $sequence_30 = { 7412 8d85d8feffff 50 57 ff15???????? }
                $sequence_31 = { 00e9 8b55e4 880c1a 8a4df3 }
                $sequence_32 = { 00ca 66897c2446 31f6 8974244c }
                $sequence_33 = { 01c1 894c2430 e9???????? 55 }
                $sequence_34 = { 01c1 81e1ffff0000 83c101 8b442474 }
                $sequence_35 = { 00e9 884c0451 83c001 39d0 }
                $sequence_36 = { 01c1 8b442448 01c8 8944243c }
                $sequence_37 = { 01c1 894c2404 8b442404 8d65fc }
                $sequence_38 = { 01c1 21d1 8a442465 f6642465 }
            condition:
                7 of them and filesize < 1168384
        }

    
sources:
  - query: |
      -- parses PE and extracts EncodedStrings from the.data section
      LET encoded_strings(data) = SELECT 
            strip(suffix='\x00\x00', string=_value) as Sections
        FROM foreach(row=split(sep='\x00\x00\x00\x00',string=data))
        WHERE Sections
      LET find_data(data) = SELECT
            encoded_strings(data=read_file(filename=data,accessor='data',offset=FileOffset,length=Size)[StringOffset:]).Sections as EncodedStrings
        FROM foreach(row=parse_pe(file=data,accessor='data').Sections,
                query={ SELECT * FROM _value })
        WHERE Name = '.data'
        
      -- decodes strings only show printable
       LET decode_strings(data) = SELECT * FROM foreach(
        row={ 
            SELECT count() - 1 as Index
            FROM range(start=0, end=len(list=data))},
        query={
            SELECT 
                filter(
                    list=split(sep='\x00',string=xor(key=data[Index],string=_value)),
                    regex="^[ -~]{2,}$" ) as String
            FROM foreach(row=data[Index:])
            WHERE len(list=String) > 2 
                AND NOT String =~ '^\\s+$'
        })
            
      -- parses PE and extracts resource sections        
      LET find_resource(data) = SELECT Type, TypeId,
            FileOffset,
            DataSize,
            read_file(filename=data,accessor='data',offset=FileOffset,length=DataSize) as Data
        FROM foreach(row=parse_pe(file=data,accessor='data').Resources)
        WHERE Type =~ ResourceRegex
        ORDER BY DataSize
      
      -- first round of RC4 encoding. Verification hash is hex of first 20 bytes
      LET rc4_wth_hashed_key(data,key) = 
            crypto_rc4(
                key = unhex(string=hash(path=key,accessor='data',hashselect='SHA1').SHA1),
                string = data )
      -- second round of RC4 encoding accounting for verification.
      LET advanced_method(data,key)= 
            crypto_rc4(
                key = rc4_wth_hashed_key(data=data,key=key)[20:40],
                string = rc4_wth_hashed_key(data=data,key=key)[40:] )
      
      -- this function finds key and verifies results.
      LET decode(data) = SELECT Key,Type,
            if(condition= Type='single',
                then= rc4_wth_hashed_key(data=data,key=Key),
                else= advanced_method(data=data,key=Key)) as Data
        FROM Keys
        WHERE format(format='%x',args=Data[:20]) = hash(path=Data[20:],accessor='data',hashselect='SHA1').SHA1
        LIMIT 1
        
      -- find netaddress method with the most expected standard ports.
      LET find_c2(methods) = SELECT _value as C2, 
            len(list=filter(list=_value,regex=':(443|80|([0-9])\1{4,})$')) as Total 
        FROM foreach(row=methods) ORDER BY Total DESC
        LIMIT 1
        
      -- bytestream: only works on a payload dll as bytestream
      LET bytestream = SELECT Rule as Detection,
            hash(path=TargetBytes, accessor='data') as DataBytes,
            len(list=TargetBytes) as Size,
            find_resource(data=TargetBytes,accessor='data') as Resources,
            find_data(data=TargetBytes,accessor='data')[0].EncodedStrings as DecodedStrings
        FROM yara(  files=TargetBytes,
                    accessor='data',
                    rules=YaraRule, key='X',
                    number=1 )

      -- find target files
      LET target_files = SELECT OSPath, Size,
                Mtime, Btime, Ctime, Atime 
        FROM glob(globs=TargetGlob)
        WHERE NOT IsDir AND Size > 0
        
      -- search for qakbot in scoped files
      LET file_payloads = SELECT * FROM foreach(row= target_files,
            query={
                SELECT
                    Rule as Detection,
                    OSPath,Size,
                    dict( 
                        Mtime = Mtime, 
                        Atime = Atime,
                        Ctime = Ctime,
                        Btime = Btime
                            ) as Timestamps,
                    find_resource(data=read_file(filename=OSPath),accessor='data') as Resources,
                    find_data(data=read_file(filename=OSPath),accessor='data')[0].EncodedStrings as DecodedStrings
                FROM yara(  files=OSPath,
                            rules=YaraRule,
                            end=Size,  key='X',
                            number=1 )
            })
            WHERE log(message="Scanning file : %v", args=OSPath)
      
      -- find processes in scope of query
      LET processes = SELECT int(int=Pid) AS Pid,
              Name, Exe, CommandLine, CreateTime,Username
        FROM process_tracker_pslist()
        WHERE Name =~ ProcessRegex
            AND format(format="%d", args=Pid) =~ PidRegex
            AND log(message="Scanning pid %v : %v", args=[Pid, Name])
      
      -- find unbacked sections with xrw permission
      LET sections = SELECT * FROM foreach(
          row=processes,
          query={
            SELECT CreateTime as ProcessCreateTime,Pid, 
                Name as ProcessName,
                Exe,
                CommandLine,
                Username,
                Address as Offset,
                Size,
                pathspec(
                    DelegateAccessor="process",
                    DelegatePath=Pid,
                    Path=Address) AS _PathSpec
            FROM vad(pid=Pid)
            WHERE MappingName=~'^$'
                AND Protection='xrw'
                AND NOT State =~ 'RESERVE'
          })
      
      -- search for qakbot in suspicious sections
      LET process_hits = SELECT *
        FROM foreach(row= sections,
            query={
                SELECT
                    Rule as Detection,
                    dict( 
                        ProcessCreateTime = ProcessCreateTime,
                        Pid = Pid,
                        ProcessName = ProcessName,
                        Exe = Exe,
                        CommandLine = CommandLine,
                        Username = Username,
                        Offset = Offset,
                        PayloadSize = Size
                    ) as ProcessInfo,
                    find_resource(data=pe_dump(in_memory=Size,pid=Pid,base_offset=Offset),accessor='data') as Resources,
                    find_data(data=pe_dump(in_memory=Size,pid=Pid,base_offset=Offset),accessor='data')[0].EncodedStrings as DecodedStrings
                FROM yara(  accessor='offset',
                            files=_PathSpec,
                            rules=YaraRule,
                            end=Size,  key='X',
                            number=1 )
            })
      
      -- decode campaign from larger resrouce
      LET decode_campaign = SELECT *,
        decode(data=Resources[0].Data)[0] as Campaign,
        decode_strings(data=DecodedStrings).String as DecodedStrings
      FROM foreach(row={ SELECT * FROM switch(
                                a = if(condition=TargetBytes, then=bytestream),
                                b = if(condition=TargetGlob, then=file_payloads),
                                c = process_hits )
                        }, workers = Workers)
        
      -- decode raw C2 data from larger resource
      LET decode_c2 = SELECT *
            advanced_method(data=Resources[1].Data,key=Campaign.Key)[20:] as _C2Raw
          FROM decode_campaign
    
      -- profile to parse Qakbot C2 data: LE netaddress with a seperator
      LET PROFILE = '''
           [
             ["Qakbot1", 0, [
               ["Method1", 0, "Array",
                     {  "type": "Entry","count": 200,
                        "sentinel": "x=>x.C2 = '0.0.0.0:0'"
                     }],
               ]],
              ["Entry", 8, [
               ["__IP", 1, "uint32"],
               ["__PORT", 5, "uint16b"],
               ['C2',0,'Value',{'value':"x=>format(format='%s:%d', args=[ip(netaddr4_le=x.__IP),x.__PORT])"}],
             ]],
             ["Qakbot2", 0, [
               ["Method2", 0, "Array",
                     {  "type": "Entry2", "count": 200,
                        "sentinel": "x=>x.C2 = '0.0.0.0:0'"
                     }],
               ]],
              ["Entry2", 7, [
               ["__IP", 1, "uint32"],
               ["__PORT", 5, "uint16b"],
               ['C2',0,'Value',{'value':"x=>format(format='%s:%d', args=[ip(netaddr4_le=x.__IP),x.__PORT])"}],
             ]]
           ]'''
           
      -- calculate C2 IPs using two observed methods
      LET results = SELECT *,
            Campaign.Key as Key,
            parse_string_with_regex(string=Campaign.Data,
                regex=[
                        '''3=(?P<Timestamp>[ -~]*)\r\n''',
                        '''10=(?P<Name>[ -~]*)\r\n'''
                    ]) as Campaign,
            parse_binary(filename=_C2Raw,accessor='data', profile=PROFILE, struct="Qakbot1").Method1['C2'] as _C2method1,
            parse_binary(filename=_C2Raw,accessor='data', profile=PROFILE, struct="Qakbot2").Method2['C2'] as _C2method2
        FROM decode_c2
      
      -- finally determine C2 encoding and pretty timestamp then output rows and remove unwanted fields
      SELECT * FROM column_filter(
        query={
            SELECT *,
                Campaign + dict(Timestamp=timestamp(epoch=Campaign.Timestamp)) as Campaign,
                --upload(accessor='scope',file='_C2Raw') as C2RawUpload, --uncomment to troubleshoot bad C2
                find_c2(methods=[_C2method1,_C2method2])[0].C2 as C2
            FROM results
        }, exclude=["_C2method1","_C2method2","_C2Raw","Resources"])