forked from LiveCarta/LiveCartaMeta
This repository has been archived on 2026-04-06 . You can view files and clone it. You cannot open issues or pull requests or push a commit.
Book Meta Data Parser
A microservice that solves a single problem: parsing book metadata from our publishers. Regardless of the format in which a publisher stores this data, the service must grab it and send an array of data to the main application without any formatting. The main idea is to add components for parsing different formats and to be able to add publishers just by updating config files.
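The component idea can be sketched as a small registry that maps a format name from the config to a parser function. This is only an illustration, not the service's actual code: `PARSERS`, `register_parser`, `parse_csv` and `parse` are hypothetical names, and the real components may be organized differently.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical registry: format name from the config -> parser function.
PARSERS: Dict[str, Callable[[str], List[dict]]] = {}


def register_parser(fmt: str):
    """Decorator that registers a parser function under a format name."""
    def decorator(func):
        PARSERS[fmt] = func
        return func
    return decorator


@register_parser("csv")
def parse_csv(text: str) -> List[dict]:
    """Return each CSV row as a dict keyed by the header row, values untouched."""
    return list(csv.DictReader(io.StringIO(text)))


def parse(fmt: str, text: str) -> List[dict]:
    """Look up the parser for `fmt` and run it; unknown formats raise ValueError."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"no parser registered for format {fmt!r}") from None
    return parser(text)
```

With a layout like this, supporting a new format would mean registering one more function, while adding a publisher that uses an existing format would only require a config change.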
Version 1.0
Added two components for working with CSV and FTP.
Tech Stack
• Docker
• Python 3.11
• MongoDB 6.0.2
• Dynaconf
• Pydantic
• MongoEngine
Folder structure
• app
◦ components
◦ configs
▪ configs.py – keys and URL for connecting to our main app, and credentials for the service DB
▪ main.json – main config
▪ sources.json – list of sources with components that they use
◦ models
◦ sources
▪ file_types
▪ source_types
Sources configuration
To configure a new source, update the source config by adding the params below:
• source_name
• source // with the necessary params for the component
• parser_type // with the necessary params for the component
Example for CSV files from FTP:
{
"sources": {
"McGrawHill": {
"source_name": "McGrawHill",
"source": {
"type": "ftp",
"ftp_url": "127.0.0.1",
"ftp_login": "frp_login",
"ftp_password": "frp_pass",
"local_files_path": "/app/files/McGrawHill/",
"file_regex": "*.csv"
},
"parser_type": {
"format": "csv"
}
}
}
}
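A config entry like the one above could be read and checked roughly as follows. This is a minimal sketch using only the standard library; `load_source` and `matching_files` are hypothetical helpers, and treating `file_regex` as a glob-style pattern is an assumption based on the `*.csv` value in the example.

```python
import json
from fnmatch import fnmatchcase

# The example config from this README, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "sources": {
    "McGrawHill": {
      "source_name": "McGrawHill",
      "source": {
        "type": "ftp",
        "ftp_url": "127.0.0.1",
        "ftp_login": "frp_login",
        "ftp_password": "frp_pass",
        "local_files_path": "/app/files/McGrawHill/",
        "file_regex": "*.csv"
      },
      "parser_type": {"format": "csv"}
    }
  }
}
"""

REQUIRED_KEYS = {"source_name", "source", "parser_type"}


def load_source(config_text: str, source_name: str) -> dict:
    """Return the config entry for one source, checking the three required params."""
    sources = json.loads(config_text)["sources"]
    if source_name not in sources:
        raise KeyError(f"unknown source: {source_name}")
    entry = sources[source_name]
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        raise ValueError(f"{source_name} is missing params: {sorted(missing)}")
    return entry


def matching_files(entry: dict, filenames: list) -> list:
    """Keep only file names that match the source's pattern (assumed glob-style)."""
    pattern = entry["source"]["file_regex"]
    return [name for name in filenames if fnmatchcase(name, pattern)]
```

For example, `matching_files(load_source(CONFIG_TEXT, "McGrawHill"), ["books.csv", "readme.txt"])` keeps only `books.csv`.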
Each source parser is started from crontab with the command:
python update.py {source_name}
To see the list of source types, use the command:
python update.py -h
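The command-line interface described above could be approximated with `argparse`, which provides the `-h` help output automatically. This is a sketch under assumptions, not the contents of the real `update.py`: `KNOWN_SOURCES` is a hypothetical list that the service would presumably build from its sources config.

```python
import argparse

# Hypothetical: in the real service this list would come from the sources config.
KNOWN_SOURCES = ["McGrawHill"]


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with one positional argument restricted to known sources."""
    parser = argparse.ArgumentParser(
        prog="update.py",
        description="Run the metadata update for one source.",
    )
    parser.add_argument(
        "source_name",
        choices=KNOWN_SOURCES,
        help="name of the source to update",
    )
    return parser


def main(argv=None) -> str:
    """Parse the arguments and return the selected source name."""
    args = build_parser().parse_args(argv)
    return args.source_name
```

Restricting `source_name` with `choices` makes `-h` list the valid names and makes an unknown name fail early with a usage error instead of partway through an update.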
Run Updates
Copy .env.sample to .env and update settings