mirror of https://github.com/ail-project/ail-framework.git synced 2024-11-22 22:27:17 +00:00

History

terrtia c5f40d85a8 fix: [doc] fix pystemon install		2024-02-08 14:14:14 +01:00
..
presentation	chg: [update doc] update doc install + logo + fix updater	2020-04-20 17:50:40 +02:00
screenshots	chg: [README.md] update	2023-06-01 14:19:05 +02:00
ail_modules_queues.png	chg: [doc] add AIL v5.0 + objects + Importers + sync	2023-06-05 16:14:29 +02:00
api.md	chg: [doc] add AIL v5.0 + objects + Importers + sync	2023-06-05 16:14:29 +02:00
README.md	fix: [doc] fix pystemon install	2024-02-08 14:14:14 +01:00

README.md

AIL objects

AIL is using different types of objects to classify, correlate and describe extracted information:

Cryptocurrency: Represents extracted cryptocurrency addresses.
- bitcoin
- bitcoin-cash
- dash
- ethereum
- litecoin
- monero
- zcash
Cve: Represents extracted CVE (Common Vulnerabilities and Exposures) IDs.
Decoded: Represents information that has been decoded from an encoded format, such as base64.
Domain: Represents crawled domains and includes metadata related to them.
Item: Represents a piece of text that has been processed by AIL. It can include various types of extracted information.
Pgp: Represents PGP key/block metadata.
- key: PGP key IDs
- mail: email addresses associated with PGP keys
- name: names associated with PGP keys.
Screenshot: Represents screenshots captured from crawled domains.
Title: Represents the HTML title extracted from web pages.
Username:
- telegram: telegram username handles
- twitter: twitter username handles
- jabber: Jabber (XMPP) username handles

AIL Importers

AIL Importers play a crucial role in the AIL ecosystem, enabling the import of various types of data into the framework.

These importers are located in the /bin/importer directory. The modular design of importers allows for easy expansion and customization, ensuring that AIL can adapt to new types of data.

Available Importers:

AIL Feeders: Extract and feed JSON data from external sources via The API.
ZMQ
pystemon
File: Import files and directories. (Manually Feed File/Dir: ./tool/file_dir_importer.py).

pystemon:

Clone the pystemon's git repository:
```
git clone https://github.com/cvandeplas/pystemon.git
```
Clone it into the same directory as AIL if you wish to launch it via the AIL launcher.

Edit configuration file for pystemon pystemon/pystemon.yaml:

Configure the storage section according to your needs:

storage:  
	archive:  
		storage-classname:  FileStorage  
		save: yes  
		save-all: yes  
		dir: "alerts"  
		dir-all: "archive"  
		compress: yes

	redis:  
		storage-classname:  RedisStorage  
		save: yes  
		save-all: yes  
		server: "localhost"  
		port: 6379  
		database: 10  
		lookup: no

Adjust the configuration for paste-sites based on your requirements (remember to throttle download and update times).

Install python dependencies inside the virtual environment:

cd ail-framework/
. ./AILENV/bin/activate
cd ../pystemon/
pip install -U -r requirements.txt

Edit the configuration file ail-framework/configs/core.cfg:
- Modify the "pystemonpath" path accordingly.

Launch ail-framework, pystemon and PystemonImporter.py (all within the virtual environment):

Option 1 (recommended):

 ./ail-framework/bin/LAUNCH.sh -l #starts ail-framework
 ./ail-framework/bin/LAUNCH.sh -f #starts pystemon and the PystemonImporter.py

Option 2 (may require two terminal windows):

./ail-framework/bin/LAUNCH.sh -l #starts ail-framework
./pystemon/pystemon.py
./ail-framework/bin/importer/PystemonImporter.py

File Importer `importer/FileImporter.py`:

Manually import File and Directory with the ./tool/file_dir_importer.py script:

Import Files:

. ./AILENV/bin/activate
cd tools/
./file_dir_importer.py -f MY_FILE_PAT

Import Dirs:

. ./AILENV/bin/activate
cd tools/
./file_dir_importer.py -d MY_DIR_PATH

Create a New Importer:

from importer.abstract_importer import AbstractImporter
from modules.abstract_module import AbstractModule

class MyNewImporter(AbstractImporter):

    def __init__(self):
        super().__init__()
        # super().__init__(queue=True)   # if it's an one-time run importer
        self.logger.info(f'Importer {self.name} initialized')

    def importer(self, my_var):
        # Process my_var and get content to import
        content = GET_MY_CONTENT_TO_IMPORT
        # if content is not gzipped and/or not b64 encoded,
        # set gzipped and/or b64 to False
        message = self.create_message(item_id, content, b64=False, gzipped=False)
        return message
        # if it's an one-time run, otherwise create an AIL Module
        # self.add_message_to_queue(message)

class MyNewModuleImporter(AbstractModule):
    def __init__(self):
        super().__init__()
        # init module ...
        self.importer = MyNewImporter()

    def get_message(self):
        return self.importer.importer()

    def compute(self, message):
        self.add_message_to_queue(message)

if __name__ == '__main__':
    module = MyNewModuleImporter()
    module.run()

    # if it's an one-time run:
    # importer = MyImporter()
    # importer.importer(my_var)

AIL Feeders

AIL Feeders are a special type of Importer within AIL, specifically designed to extract and feed data from external sources into the framework.

Extract Data: AIL Feeders extract data from external sources, such as APK files, certificate transparency logs, GitHub archives, repositories, ActivityPub sources, leaked files, Atom/RSS feeds, JSON logs, Discord, and Telegram, ...
Run Independently: Feeders can run on separate systems or infrastructure, providing flexibility and scalability. They operate independently from the core AIL framework.
Internal Logic: Each feeder can implement its own custom logic and processing to extract and transform data and metadata from the source into JSON.
Push to AIL API: The generated JSON is then pushed to the AIL API for ingestion and further analysis within the AIL framework.

AIL Feeders List:

ail-feeder-apk: Pushes annotated APK to an AIL instance for yara detection.
ail-feeder-ct: AIL feeder for certificate transparency.
ail-feeder-github-gharchive: extract informations from GHArchive, collect and feed AIL
ail-feeder-github-repo: Pushes github repositories to AIL.
ail-feeder-activity-pub ActivityPub feeder.
ail-feeder-leak: Automates the process of feeding files to AIL, using data chunking to handle large files.
ail-feeder-atom-rss Atom and RSS feeder for AIL.
ail-feeder-jsonlogs Aggregate JSON log lines and pushes them to AIL.

AIL Chats Feeders List:

ail-feeder-discord Discord Feeder.
ail-feeder-telegram Telegram Channels and User Feeder.

Chats Message

Overview of the JSON fields used by the Chat feeder.

{
    "data": "New NFT Scam available,"
    "meta": {
        "chat": {
            "date": {
                "datestamp": "2023-01-10 08:19:16",
                "timestamp": 1673870217.0,
                "timezone": "UTC"
            },
            "icon": "AAAAAAAA",
            "id": 123456,
            "info": "",
            "name": "NFT legit",
            "subchannel": {
                "date": {
                    "datestamp": "2023-08-10 08:19:18",
                    "timestamp": 1691655558.0,
                    "timezone": "UTC"
                },
                "id": 285,
                "name": "Market"
            },
        },
        "date": {
            "datestamp": "2024-02-01 13:43:46",
            "timestamp": 1707139999.0,
            "timezone": "UTC"
        },
        "id": 16,
        "reply_to": {
            "message_id": 12
        },
        "sender": {
            "first_name": "nftmaster",
            "icon": "AAAAAAAA",
            "id": 5684,
            "info": "best legit NFT vendor",
            "username": "nft_best"
        },
        "type": "message"
    },
    "source": "ail_feeder_telegram",
    "source-uuid": "9cde0855-248b-4439-b964-0495b9b2b8bb"
}

1. "data"

Content of the message.

2. "meta"

Provides metadata about the message.

"type":
- Indicates the type of message. It can be either "message" or "image".
"id":
- The unique identifier of the message.
"date":
- Represents the timestamp of the message.
  - "datestamp": The date in the format "YYYY-MM-DD HH:MM:SS".
  - "timestamp": The timestamp representing the date and time.
  - "timezone": The timezone in which the date and time are specified (e.g., "UTC").
"reply_to":
- The unique identifier of a message to which this message is a reply (optional).
  - "message_id": The unique identifier of the replied message.
"sender":
- Contains information about the sender of the message.
  - "id": The unique identifier for the sender.
  - "info": Additional information about the sender (optional).
  - "username": The sender's username (optional).
  - "firstname": The sender's firstname (optional).
  - "lastname": The sender's lastname (optional).
  - "phone": The sender's phone (optional).
"chat":
- Contains information about the chat where the message was sent.
  - "date": The chat creation date.
    - "datestamp": The date in the format "YYYY-MM-DD HH:MM:SS".
    - "timestamp": The timestamp representing the date and time.
    - "timezone": The timezone in which the date and time are specified (e.g., "UTC").
  - "icon": The icon associated with the chat (optional).
  - "id": The unique identifier of the chat.
  - "info": Chat description/info (optional).
  - "name": The name of the chat.
  - "username": The username of the chat (optional).
  - "subchannel": If this message is posted in a subchannel within the chat (optional).
    - "date": The subchannel creation date.
      - "datestamp": The date in the format "YYYY-MM-DD HH:MM:SS".
      - "timestamp": The timestamp representing the date and time.
      - "timezone": The timezone in which the date and time are specified (e.g., "UTC").
    - "id": The unique identifier of the subchannel.
    - "name": The name of the subchannel (optional).

3. "source"

Indicates the feeder name.

4. "source-uuid"

The UUID associated with the source.

Example: Feeding AIL with Conti leaks

from pyail import PyAIL
pyail = PyAIL(URL, API_KEY, ssl=verifycert)

#. . . imports
#. . . setup code

for elem in sys.stdin:
    elm = json.loads(elem)
    content = elm['body']
    meta = {}
    meta['jabber:to'] = elm['to']
    meta['jabber:from'] = elm['from']
    meta['jabber:ts]' = elm['ts']
    pyail.feed_json_item(content , meta, feeder_name, feeder_uuid)

AIL SYNC

The synchronisation mechanism allow the sync from one AIL instance to another AIL using a standard WebSocket using AIL JSON protocol. The synchronisation allows to filter and sync specific collected items including crawled items or specific tagged items matching defined rules. This feature can be very useful to limit the scope of analysis in specific fields or resource intensive activity. This sync can be also used to share filtered streams with other partners.

README.md

AIL objects

AIL Importers

pystemon:

File Importer importer/FileImporter.py:

Create a New Importer:

AIL Feeders

AIL Feeders List:

AIL Chats Feeders List:

Chats Message

1. "data"

2. "meta"

"type":

"id":

"date":

"reply_to":

"sender":

"chat":

3. "source"

4. "source-uuid"

Example: Feeding AIL with Conti leaks

AIL SYNC

File Importer `importer/FileImporter.py`: