# CPE guesser
CPE Guesser is a command-line tool or web service designed to guess the CPE name based on one or more keywords. The resulting CPE can then be used with tools like [cve-search](https://github.com/cve-search/cve-search) or [vulnerability-lookup](https://github.com/cve-search/vulnerability-lookup) to perform actual searches using CPE names.
## Requirements
- [Valkey](https://valkey.io/)
- Python
## Usage
To use CPE Guesser, you need to initialize the [Valkey](https://valkey.io/) database with `import.py`.

Once initialized, you can use the software with `lookup.py` to find the most probable CPE matching the provided keywords.

Alternatively, you can call the web server (after running `server.py`). For example:
```bash
curl -s -X POST http://localhost:8000/search -d '{"query": ["tomcat"]}' | jq .
```
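
The same request can be sketched in Python with only the standard library. The helper names `build_search_request` and `search` are illustrative and not part of CPE Guesser. The default base URL is taken from the curl example above, and no `Content-Type` header is set explicitly, to mirror what `curl -d` sends.

```python
import json
import urllib.request


def build_search_request(keywords, base_url="http://localhost:8000"):
    """Build a POST request for /search carrying a JSON body {"query": [...]}."""
    body = json.dumps({"query": list(keywords)}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/search", data=body, method="POST")


def search(keywords, base_url="http://localhost:8000"):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_search_request(keywords, base_url)) as response:
        return json.load(response)
```

With `server.py` running, `search(["tomcat"])` should return the same JSON that the curl command prints.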
### Installation
1. `git clone https://github.com/cve-search/cpe-guesser.git`
2. `cd cpe-guesser`
3. Download the CPE dictionary and populate the database with `python3 ./bin/import.py`.
4. Take a cup of black or green tea while the import runs.
5. `python3 ./bin/server.py` to run the local HTTP server.

If you don't want to install it locally, there is a public online version. Check below.
### Docker
#### Single image with existing Valkey
```bash
docker build . -t cpe-guesser:1.0
# Edit settings.yaml content and/or path
# Options such as -v must come before the image name
docker run -v "$(pwd)/config/settings.yaml:/app/config/settings.yaml" cpe-guesser:1.0
# Please wait for full import
```
#### Docker-compose
```bash
cd docker
# Edit docker/settings.yaml as you want
docker-compose up --build -d
# Please wait for full import
```
#### Specific usage
If you do not want to use the Web server, `lookup.py` can still be used. Example: `docker exec -it cpe-guesser python3 /app/bin/lookup.py tomcat`
## Public online version
[cpe-guesser.cve-search.org](https://cpe-guesser.cve-search.org) is a public online version of CPE Guesser that can be used via a simple API. The endpoint is `/search`, and the JSON body consists of a `query` list containing the keyword(s) to search for.
```bash
curl -s -X POST https://cpe-guesser.cve-search.org/search -d "{\"query\": [\"outlook\", \"connector\"]}" | jq .
```
```json
[
  [
    18117,
    "cpe:2.3:a:microsoft:outlook_connector"
  ],
  [
    60947,
    "cpe:2.3:a:oracle:oracle_communications_unified_communications_suite_connector_for_microsoft_outlook"
  ],
  [
    68306,
    "cpe:2.3:a:oracle:corporate_time_outlook_connector"
  ]
]
```
The endpoint `/unique` is available to retrieve only the best-matching CPE entry.
```bash
curl -s -X POST https://cpe-guesser.cve-search.org/unique -d "{\"query\": [\"outlook\", \"connector\"]}" | jq .
```
```json
"cpe:2.3:a:oracle:corporate_time_outlook_connector"
```
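
In the sample outputs above, the `/unique` result coincides with the `/search` entry that has the highest number. Assuming that number is a rank where higher means a better match (an assumption drawn from these samples, not a documented guarantee), a client could pick the best entry from a `/search` response itself. `pick_best` is a hypothetical helper, not part of the tool:

```python
def pick_best(results):
    """Return the CPE with the highest rank from a /search-style result list.

    Assumes each entry is a [rank, cpe] pair and that a higher rank is better.
    """
    if not results:
        return None
    _rank, cpe = max(results, key=lambda entry: entry[0])
    return cpe


sample = [
    [18117, "cpe:2.3:a:microsoft:outlook_connector"],
    [68306, "cpe:2.3:a:oracle:corporate_time_outlook_connector"],
]
best = pick_best(sample)
```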
### Command line - `lookup.py`
```text
usage: lookup.py [-h] [--unique] WORD [WORD ...]

Find potential CPE names from a list of keyword(s) and return a JSON of the results

positional arguments:
  WORD        One or more keyword(s) to lookup

options:
  -h, --help  show this help message and exit
  --unique    Return the best CPE matching the keywords given
```
```bash
python3 lookup.py microsoft sql server | jq .
```
```json
[
  [
    51325,
    "cpe:2.3:a:microsoft:sql_server_2017_reporting_services"
  ],
  [
    51326,
    "cpe:2.3:a:microsoft:sql_server_2019_reporting_services"
  ],
  [
    57898,
    "cpe:2.3:a:quest:intrust_knowledge_pack_for_microsoft_sql_server"
  ],
  [
    60386,
    "cpe:2.3:o:microsoft:sql_server"
  ],
  [
    60961,
    "cpe:2.3:a:microsoft:sql_server_desktop_engine"
  ],
  [
    64810,
    "cpe:2.3:a:microsoft:sql_server_reporting_services"
  ],
  [
    75858,
    "cpe:2.3:a:microsoft:sql_server_management_studio"
  ],
  [
    77570,
    "cpe:2.3:a:microsoft:sql_server"
  ],
  [
    78206,
    "cpe:2.3:a:ibm:tivoli_storage_manager_for_databases_data_protection_for_microsoft_sql_server"
  ]
]
```
## How does this work?
A CPE entry is composed of a human-readable name with some references and the structured CPE name.
```xml
<cpe-item name="cpe:/a:10web:form_maker:1.7.17::~~~wordpress~~">
  <title xml:lang="en-US">10web Form Maker 1.7.17 for WordPress</title>
  <references>
    <reference href="https://wordpress.org/plugins/form-maker/#developers">Change Log</reference>
  </references>
  <cpe-23:cpe23-item name="cpe:2.3:a:10web:form_maker:1.7.17:*:*:*:*:wordpress:*:*"/>
</cpe-item>
```
The CPE name is structured with a vendor name, a product name and some additional information.
CPE names are hard to guess for three reasons: they change when a vendor or product is renamed, some vendors and products
share common names, and a name can be composed of multiple words.
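
As a rough illustration of that structure, the vendor and product can be read from the colon-separated fields of a CPE 2.3 name. The helper `vendor_product` is hypothetical, and the naive split below does not handle escaped colons (`\:`) inside a field:

```python
def vendor_product(cpe23):
    """Extract (part, vendor, product) from a CPE 2.3 formatted string.

    Naive sketch: splits on every ':' and ignores escaped colons.
    """
    fields = cpe23.split(":")
    if len(fields) < 5 or fields[0] != "cpe" or fields[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 name: {cpe23!r}")
    part, vendor, product = fields[2:5]
    return part, vendor, product


example = vendor_product("cpe:2.3:a:10web:form_maker:1.7.17:*:*:*:*:wordpress:*:*")
```

Here `example` holds the part `a` (application), the vendor `10web` and the product `form_maker`.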
### Data
The vendor name and product name are split on separators (such as `_`) into single words, and each word is then canonized. An inverse index is built
with the canonized word as key and the CPE `vendor:product` format as value. CPE Guesser then creates a ranked set with the most common
CPE (`vendor:product`) per version to give a probability of the CPE appearance.
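
The indexing step described above can be sketched in memory as follows. This is not the code of `import.py`: the functions `canonize` and `build_index` are hypothetical, and lowercasing stands in for whatever canonization the tool actually applies.

```python
from collections import Counter, defaultdict


def canonize(word):
    """Placeholder canonization (assumption): lowercase only."""
    return word.lower()


def build_index(cpe_names):
    """Build an inverse index and a per-CPE count from CPE 2.3 names."""
    index = defaultdict(set)  # canonized word -> set of "cpe:2.3:<part>:<vendor>:<product>"
    ranking = Counter()       # same prefix -> number of versioned entries seen
    for name in cpe_names:
        fields = name.split(":")
        prefix = ":".join(fields[:5])
        vendor, product = fields[3], fields[4]
        ranking[prefix] += 1
        for word in vendor.split("_") + product.split("_"):
            if word:
                index[canonize(word)].add(prefix)
    return index, ranking


index, ranking = build_index([
    "cpe:2.3:a:microsoft:sql_server:2017:*:*:*:*:*:*:*",
    "cpe:2.3:a:microsoft:sql_server:2019:*:*:*:*:*:*:*",
    "cpe:2.3:a:10web:form_maker:1.7.17:*:*:*:*:wordpress:*:*",
])
```

Each word of a vendor or product name points back to every `vendor:product` pair that contains it, and the counter records how many versions of each pair were seen.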
### Valkey structure
- `w:<word>` set
- `s:<word>` sorted set with a score depending on the number of appearances
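
To make the key layout concrete, here is an in-memory stand-in that uses a plain dictionary instead of Valkey. The functions `add_entry` and `lookup` are hypothetical. Intersecting the `w:` sets and summing the `s:` scores across the query words is an assumption made for illustration, so the real `lookup.py` may rank candidates differently.

```python
def add_entry(store, word, cpe):
    """Record one appearance of `cpe` under `word` in both key families."""
    store.setdefault(f"w:{word}", set()).add(cpe)   # w:<word> behaves like a set
    scores = store.setdefault(f"s:{word}", {})      # s:<word> behaves like a sorted set
    scores[cpe] = scores.get(cpe, 0) + 1            # score = number of appearances


def lookup(store, words):
    """Return (score, cpe) pairs matching all words, lowest score first."""
    words = [w.lower() for w in words]
    if not words:
        return []
    candidates = set.intersection(*(store.get(f"w:{w}", set()) for w in words))
    scored = [
        (sum(store.get(f"s:{w}", {}).get(cpe, 0) for w in words), cpe)
        for cpe in candidates
    ]
    return sorted(scored)


store = {}
for _ in range(2):  # two versions of the same vendor:product
    add_entry(store, "sql", "cpe:2.3:a:microsoft:sql_server")
    add_entry(store, "server", "cpe:2.3:a:microsoft:sql_server")
for word in ("sql", "server", "reporting"):
    add_entry(store, word, "cpe:2.3:a:microsoft:sql_server_reporting_services")
```

With a real Valkey instance, the dictionary operations would map to set and sorted-set commands on the same key names.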
## License
Software is open source and released under a 2-Clause BSD License.

Copyright (C) 2021-2024 Alexandre Dulaunoy
Copyright (C) 2021-2024 Esa Jokinen