chg: [HOWTO] improve HOWTO

Terrtia 2023-05-30 14:48:06 +02:00
parent 2ebe4845a7
commit 50abff66b4
GPG key ID: 1E1B1F50D84613D0
3 changed files with 46 additions and 48 deletions


@ -1,17 +1,15 @@
Feeding, adding new features and contributing # Feeding, adding new features and contributing
=============================================
How to feed the AIL framework ## How to feed the AIL framework
-----------------------------
For the moment, there are three different ways to feed AIL with data: Currently, there are three different ways to feed data into AIL:
1. Be a collaborator of CIRCL and ask to access our feed. It will be sent to the static IP you are using for AIL. 1. Be a collaborator of CIRCL and ask to access our feed. It will be sent to the static IP you are using for AIL.
2. You can setup [pystemon](https://github.com/cvandeplas/pystemon) and use the custom feeder provided by AIL (see below). 2. You can setup [pystemon](https://github.com/cvandeplas/pystemon) and use the custom feeder provided by AIL (see below).
3. You can feed your own data using the [./bin/file_dir_importer.py](./bin/import_dir.py) script. 3. You can feed your own data using the [./tool/file_dir_importer.py](./tool/file_dir_importer.py) script.
### Feeding AIL with pystemon ### Feeding AIL with pystemon
@ -21,10 +19,12 @@ However, if you want to collect some pastes and feed them to AIL, the procedure
Feed data to AIL: Feed data to AIL:
1. Clone the [pystemon's git repository](https://github.com/cvandeplas/pystemon): 1. Clone the [pystemon's git repository](https://github.com/cvandeplas/pystemon):
``` git clone https://github.com/cvandeplas/pystemon.git ``` ```
git clone https://github.com/cvandeplas/pystemon.git
```
2. Edit configuration file for pystemon ```pystemon/pystemon.yaml```: 2. Edit configuration file for pystemon ```pystemon/pystemon.yaml```:
* Configuration of storage section (adapt to your needs): - Configure the storage section according to your needs:
``` ```
storage: storage:
archive: archive:
@ -44,68 +44,61 @@ Feed data to AIL:
database: 10 database: 10
lookup: no lookup: no
``` ```
* Change configuration for paste-sites according to your needs (don't forget to throttle download time and/or update time). - Adjust the configuration for paste-sites based on your requirements (remember to throttle download and update times).
3. Install python dependencies inside the virtual environment: 3. Install python dependencies inside the virtual environment:
``` ```shell
cd ail-framework/ cd ail-framework/
. ./AILENV/bin/activate . ./AILENV/bin/activate
cd pystemon/ #cd to pystemon folder cd pystemon/
pip3 install -U -r requirements.txt pip install -U -r requirements.txt
``` ```
4. Edit configuration file ```ail-framework/configs/core.cfg```: 4. Edit the configuration file ```ail-framework/configs/core.cfg```:
* Modify the "pystemonpath" path accordingly - Modify the "pystemonpath" path accordingly.
5. Launch ail-framework, pystemon and pystemon-feeder.py (still inside virtual environment): 5. Launch ail-framework, pystemon and PystemonImporter.py (all within the virtual environment):
* Option 1 (recommended): - Option 1 (recommended):
```
./ail-framework/bin/LAUNCH.py -l #starts ail-framework
./ail-framework/bin/LAUNCH.py -f #starts pystemon and the pystemon-feeder.py
``` ```
* Option 2 (you may need two terminal windows):
```
./ail-framework/bin/LAUNCH.py -l #starts ail-framework ./ail-framework/bin/LAUNCH.py -l #starts ail-framework
./pystemon/pystemon.py ./ail-framework/bin/LAUNCH.py -f #starts pystemon and the PystemonImporter.py
./ail-framework/bin/feeder/pystemon-feeder.py ```
``` - Option 2 (may require two terminal windows):
```
./ail-framework/bin/LAUNCH.py -l #starts ail-framework
./pystemon/pystemon.py
./ail-framework/bin/importer/PystemonImporter.py
```
How to create a new module ## How to create a new module
--------------------------
If you want to add a new processing or analysis module in AIL, follow these simple steps: To add a new processing or analysis module to AIL, follow these steps:
1. Add your module name in [./bin/packages/modules.cfg](./bin/packages/modules.cfg) and subscribe to at least one module at minimum (Usually, Redis_Global). 1. Add your module name in [./configs/modules.cfg](./configs/modules.cfg) and subscribe to at least one module at minimum (Usually, `Item`).
2. Use [./bin/template.py](./bin/template.py) as a sample module and create a new file in bin/ with the module name used in the modules.cfg configuration. 2. Use [./bin/modules/modules/TemplateModule.py](./bin/modules/modules/TemplateModule.py) as a sample module and create a new file in bin/modules with the module name used in the `modules.cfg` configuration.
How to contribute a module ## How to contribute a module
--------------------------
Feel free to fork the code, play with it, make some patches or add additional analysis modules. Feel free to fork the code, play with it, make some patches or add additional analysis modules.
To contribute your module, feel free to pull your contribution. To contribute your module, feel free to pull your contribution.
Additional information ## Additional information
======================
Crawler ### Crawler
---------------------
In AIL, you can crawl websites and Tor hidden services. Don't forget to review the proxy configuration of your Tor client and especially if you enabled the SOCKS5 proxy In AIL, you can crawl websites and Tor hidden services. Don't forget to review the proxy configuration of your Tor client and especially if you enabled the SOCKS5 proxy
[//]: # (and binding on the appropriate IP address reachable via the dockers where Splash runs.)
### Installation ### Installation
[Install Lacus](https://github.com/ail-project/lacus) [Install Lacus](https://github.com/ail-project/lacus)
### Configuration ### Configuration
1. Lacus URL: 1. Lacus URL:
In the webinterface, go to ``Crawlers>Settings`` and click on the Edit button In the web interface, go to `Crawlers` > `Settings` and click on the Edit button
![Splash Manager Config](./doc/screenshots/lacus_config.png?raw=true "AIL Lacus Config") ![Splash Manager Config](./doc/screenshots/lacus_config.png?raw=true "AIL Lacus Config")
@ -115,10 +108,11 @@ In the webinterface, go to ``Crawlers>Settings`` and click on the Edit button
Choose the number of crawlers you want to launch Choose the number of crawlers you want to launch
![Splash Manager Nb Crawlers Config](./doc/screenshots/crawler_nb_captures.png?raw=true "AIL Lacus Nb Crawlers Config") ![Splash Manager Nb Crawlers Config](./doc/screenshots/crawler_nb_captures.png?raw=true "AIL Lacus Nb Crawlers Config")
![Splash Manager Nb Crawlers Config](./doc/screenshots/crawler_nb_captures_edit.png?raw=true "AIL Lacus Nb Crawlers Config") ![Splash Manager Nb Crawlers Config](./doc/screenshots/crawler_nb_captures_edit.png?raw=true "AIL Lacus Nb Crawlers Config")
Kvrocks Migration ### Kvrocks Migration
--------------------- ---------------------
**Important Note: **Important Note:
We are currently working on a [migration script](https://github.com/ail-project/ail-framework/blob/master/bin/DB_KVROCKS_MIGRATION.py) to facilitate the migration to Kvrocks. We are currently working on a [migration script](https://github.com/ail-project/ail-framework/blob/master/bin/DB_KVROCKS_MIGRATION.py) to facilitate the migration to Kvrocks.
@ -130,12 +124,12 @@ Please note that the current version of this migration script only supports migr
To migrate your database to Kvrocks: To migrate your database to Kvrocks:
1. Launch ARDB and Kvrocks 1. Launch ARDB and Kvrocks
2. Pull from remote 2. Pull from remote
``` ```shell
git checkout master git checkout master
git pull git pull
``` ```
3. Launch the migration script: 3. Launch the migration script:
``` ```shell
git checkout master git checkout master
git pull git pull
cd bin/ cd bin/


@ -30,15 +30,15 @@ class Template(AbstractModule):
def __init__(self): def __init__(self):
super(Template, self).__init__() super(Template, self).__init__()
# Pending time between two computation (computeNone) in seconds # Pending time between two computation (computeNone) in seconds, 10 by default
self.pending_seconds = 10 # self.pending_seconds = 10
# Send module state to logs # logs
self.logger.info(f'Module {self.module_name} initialized') self.logger.info(f'Module {self.module_name} initialized')
# def computeNone(self): # def computeNone(self):
# """ # """
# Do something when there is no message in the queue # Do something when there is no message in the queue. Optional
# """ # """
# self.logger.debug("No message in queue") # self.logger.debug("No message in queue")
@ -53,6 +53,5 @@ class Template(AbstractModule):
if __name__ == '__main__': if __name__ == '__main__':
module = Template() module = Template()
module.run() module.run()
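
The template above can be hard to follow without the base class at hand, so here is a minimal, self-contained sketch of the same pattern. It is an illustration under stated assumptions, not AIL code: `StubAbstractModule`, `run_once`, the in-memory `queue` list, and the `WordCounter` example are all hypothetical stand-ins, and the `compute(message)` hook name is assumed by analogy with `computeNone`. Only `module_name`, `logger`, `pending_seconds`, and `computeNone` appear in the diff.

```python
import logging


class StubAbstractModule:
    """Hypothetical stand-in for AIL's AbstractModule (illustration only)."""

    def __init__(self):
        self.module_name = type(self).__name__
        self.logger = logging.getLogger(self.module_name)
        # The template comment states 10 seconds is the default pending time
        self.pending_seconds = 10
        # Assumption: a plain list stands in for the subscribed queue
        self.queue = []

    def compute(self, message):
        # Assumed hook name: called once per message taken from the queue
        raise NotImplementedError

    def computeNone(self):
        # Optional hook: called when there is no message in the queue
        pass

    def run_once(self):
        """Process one message; return False if the queue was empty."""
        if self.queue:
            self.compute(self.queue.pop(0))
            return True
        self.computeNone()
        return False


class WordCounter(StubAbstractModule):
    """Example module: counts whitespace-separated words per message."""

    def __init__(self):
        super().__init__()
        self.counts = []
        self.idle_calls = 0
        self.logger.info(f'Module {self.module_name} initialized')

    def compute(self, message):
        self.counts.append(len(message.split()))

    def computeNone(self):
        self.idle_calls += 1
        self.logger.debug("No message in queue")


if __name__ == '__main__':
    module = WordCounter()
    module.queue.extend(["hello world", "one two three"])
    # Drain the queue; the last call hits computeNone() and stops the loop
    while module.run_once():
        pass
    print(module.counts)
```

In the real framework the base class owns the queue handling and the `run()` loop; the sketch only mirrors the division of work between the per-message hook and the optional idle hook.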

View file

@ -168,4 +168,9 @@ subscribe = Url
# [My_Module_Name] # [My_Module_Name]
# subscribe = Global # Queue name # subscribe = Global # Queue name
# publish = Tags # Queue name # publish = Tags # Queue name
#
# [TemplateModule.]
# subscribe = Global # Queue name
# publish = Tags # Queue name
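
To make the `subscribe`/`publish` layout concrete, the following sketch reads an uncommented copy of the example section with Python's standard `configparser`. It is an assumption-laden illustration rather than how AIL loads `modules.cfg`: the `read_queues` helper is hypothetical, and treating values as comma-separated lists of queue names is an assumption not shown in this diff.

```python
import configparser

# Uncommented copy of the example section from modules.cfg
SAMPLE = """
[My_Module_Name]
subscribe = Global  # Queue name
publish = Tags      # Queue name
"""


def read_queues(text):
    """Return {section: {'subscribe': [...], 'publish': [...]}}.

    Hypothetical helper: assumes queue names may be comma-separated.
    """
    # inline_comment_prefixes strips the trailing "# Queue name" comments
    parser = configparser.ConfigParser(inline_comment_prefixes=('#',))
    parser.read_string(text)

    def split(value):
        return [name.strip() for name in value.split(',') if name.strip()]

    return {
        section: {
            'subscribe': split(parser.get(section, 'subscribe', fallback='')),
            'publish': split(parser.get(section, 'publish', fallback='')),
        }
        for section in parser.sections()
    }


if __name__ == '__main__':
    print(read_queues(SAMPLE))
```

Without `inline_comment_prefixes`, `configparser` would keep the `# Queue name` text as part of each value, which is why the sketch sets it explicitly.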