diff --git a/README.md b/README.md
index b0faa3d..7cadc5d 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,25 @@
-# About
+

Converter


+

+ + LiveCarta converter + +

-This repository contains code related to docx/epub files conversion to livecarta inner format.
+
+
+## Table of Contents
+- [Introduction](#introduction)
+- [Features](#features)
+- [Top level project structure](#top-level-project-structure)
+- [How it Works](#how-it-works)
+- [Setup](#setup)
+  - [Development](#development)
+- [How to Use](#how-to-use)
+
+
+## Introduction
+This is a Python 3 project for converting Docx and Epub documents to the LiveCarta inner format.
 
 Livecarta book format is tree structure, where nodes are chapters.
 Livecarta chapter is title + html code. Livecarta html code follows some restrictions:
@@ -12,10 +30,57 @@ Livecarta chapter is title + html code. Livecarta html code follows some restric
 - Styles are added as _inline_, i.e. attribute `style` in html tag.
 - Each tag has its own restrictions on attributes and style. See doc/style_config
 
+## Features
+- Converts Epub and Docx to JSON (LiveCarta inner format)
+- Compatible with Python 3
+- Very small size (only .py files)
+- Multithreaded
 
-# Top level project structure
-
+## Top level project structure
 - `consumer.py` - code which is responsible for receiving messages from rabbitMQ
 - class `Access` - contains API code which is responsible for interaction with server.
 - class `Solver` - contains code responsible for pipeline of solving the task: receiving book file, conversion, status updating, sending result back to server.
-- `livecarta_config.py `- constants that depend on LiveCarta
\ No newline at end of file
+- `livecarta_config.py` - constants that depend on LiveCarta
+
+## How it Works
+There are **2 approaches**, each consisting of 3 steps:
+#### Epub
+**Step 1** - Add CSS to the HTML as inline `style` attributes
+
+**Step 2** - Process every HTML chapter of the Epub with presets
+
+**Step 3** - Convert dicts of HTML to JSON (LiveCarta inner format)
+
+#### Docx
+**Step 1** - Conversion of DOCX to HTML via LibreOffice
+
+**Step 2** - Process HTML with presets
+
+**Step 3** - Conversion of HTML to JSON (LiveCarta inner format)
+
+## Setup
+
+    python -m pip install -r requirements.txt
+
+### Development
+To fix a bug or enhance an existing module, follow these steps:
+
+- Fork the repo
+- Create a new branch (`git checkout -b improve-feature`)
+- Make the appropriate changes in the files
+- Add changes to reflect the changes made
+- Commit your changes (`git commit -am 'Improve feature'`)
+- Push to the branch (`git push origin improve-feature`)
+- Create a Pull Request
+
+## How to Use
+**a.** Run `consumer.py`
+The script waits continuously for messages from the RabbitMQ queue; books are added to the queue via "Import File to Convert" in the admin panel.
+You can also upload a book that has been converted locally using `local_convert()` in `consumer.py`.
+
+**b.** Run `docx_solver.py`
+1. You need to run it on a Linux system; if you are using Windows, use a Python Docker interpreter instead
+2. Upload a book to books/docx/ and set the variable `docx_file_path = books/docx/book_name` in `__main__`
+
+**c.** Run `epub_solver.py`
+Before that, upload a book to books/epub/ and set the variable `epub_file_path = books/epub/book_name` in `__main__`
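
For reference, Step 1 of the Docx approach (conversion of DOCX to HTML via LibreOffice) might look roughly like the sketch below. It is not taken from this repository: the function names `build_convert_command` and `convert_docx_to_html`, the `soffice` binary name, and the example paths are assumptions made for illustration. The `--headless`, `--convert-to`, and `--outdir` flags are standard LibreOffice command-line options.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, binary="soffice"):
    """Build the LibreOffice command line for a headless DOCX -> HTML conversion.

    The binary name "soffice" is an assumption; some systems expose it
    as "libreoffice" instead.
    """
    return [
        binary,
        "--headless",            # run without a GUI
        "--convert-to", "html",  # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_html(docx_path, out_dir, binary="soffice"):
    """Run LibreOffice and return the path where the HTML file is expected."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(build_convert_command(docx_path, out_dir, binary), check=True)
    # LibreOffice names the output after the input file, with the new extension.
    return Path(out_dir) / (Path(docx_path).stem + ".html")
```

Keeping the command construction separate from the subprocess call makes it possible to check the arguments without LibreOffice installed, for example `build_convert_command("books/docx/book_name.docx", "out")`.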