Write README.md

This commit is contained in:
Kiryl
2022-09-12 15:24:41 +03:00
parent ea13d38f27
commit 317d040a06

@@ -1,7 +1,25 @@
<h1 align="center"> Converter </h1> <br>
<p align="center">
<a href="https://livecarta.com/">
<img alt="LiveCarta converter" title="LiveCarta converter" src="https://assets.openstax.org/oscms-prodcms/media/partner_logos/LiveCarta_Logo.png" width="450">
</a>
</p>
This repository contains code for converting DOCX/EPUB files to the LiveCarta inner format.
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
## Table of Contents
- [Introduction](#introduction)
- [Features](#features)
- [Top level project structure](#top-level-project-structure)
- [How it Works](#how-it-works)
- [Setup](#setup)
- [Development](#development)
- [How to Use](#how-to-use)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
## Introduction
This is a Python 3 project that converts DOCX and EPUB documents to the LiveCarta inner format.
The LiveCarta book format is a tree structure whose nodes are chapters.
A LiveCarta chapter is a title plus HTML code. LiveCarta HTML code follows some restrictions:
@@ -12,10 +30,57 @@ Livecarta chapter is title + html code. Livecarta html code follows some restric
- Styles are added _inline_, i.e. through the `style` attribute of the HTML tag.
- Each tag has its own restrictions on attributes and style. See `doc/style_config`.
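
As a rough illustration of the tree described above, the sketch below models a book as nested chapters and serializes it to JSON. The `Chapter` class and the field names `title`, `html`, and `children` are assumptions made for this example; the actual LiveCarta inner format may use different names and carry more data.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chapter:
    """Hypothetical tree node: a title, its HTML body, and nested sub-chapters."""
    title: str
    html: str
    children: List["Chapter"] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Recursively convert this node and its sub-chapters to plain dicts.
        return {
            "title": self.title,
            "html": self.html,
            "children": [child.to_dict() for child in self.children],
        }


# Styles are inline, as required by the restrictions listed above.
book = Chapter(
    title="Part I",
    html='<p style="font-weight: bold">Intro text</p>',
    children=[Chapter(title="Chapter 1", html="<p>First chapter</p>")],
)

book_json = json.dumps(book.to_dict(), indent=2)
print(book_json)
```
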
## Features
- Converts EPUB and DOCX to JSON (LiveCarta inner format)
- Compatible with Python 3
- Very small size (only .py files)
- Multithreaded
## Top level project structure
- `consumer.py` - code responsible for receiving messages from RabbitMQ
- class `Access` - contains the API code responsible for interaction with the server.
- class `Solver` - contains the code responsible for the task pipeline: receiving the book file, conversion, status updating, and sending the result back to the server.
- `livecarta_config.py` - constants that depend on LiveCarta
## How it Works
There are **2 approaches**, each consisting of 3 steps:
#### Epub
**Step 1** - Add CSS to HTML as inline styles
**Step 2** - Process every HTML chapter of the Epub with presets
**Step 3** - Convert dicts of HTML to JSON (LiveCarta inner format)
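
To make Step 1 more concrete, here is a minimal sketch of moving CSS rules into inline `style` attributes. It is not the converter's actual implementation: the function names `parse_css` and `inline_styles` are hypothetical, and the sketch only handles plain tag selectors (such as `p` or `h1`), ignoring classes, ids, specificity, and tags that already carry a `style` attribute.

```python
import re


def parse_css(css: str) -> dict:
    """Map each plain tag selector (e.g. `p`) to its declaration string."""
    rules = {}
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = "; ".join(
            part.strip() for part in body.split(";") if part.strip()
        )
        rules[selector.strip()] = declarations
    return rules


def inline_styles(html: str, css: str) -> str:
    """Copy tag-level CSS rules into a `style` attribute on matching tags."""
    rules = parse_css(css)

    def add_style(match):
        tag, attrs = match.group(1), match.group(2)
        style = rules.get(tag.lower())
        has_style = re.search(r"(?<![\w-])style\s*=", attrs) is not None
        if style is None or has_style or attrs.rstrip().endswith("/"):
            # No rule for this tag, tag already styled, or self-closing tag.
            return match.group(0)
        return f'<{tag}{attrs} style="{style}">'

    # Match opening tags only: a tag name followed by optional attributes.
    return re.sub(r"<([a-zA-Z][a-zA-Z0-9]*)([^<>]*)>", add_style, html)


css = "p { color: red; margin: 0 } h1 { font-size: 2em }"
html = '<h1>Title</h1><p class="lead">Text</p>'
styled_html = inline_styles(html, css)
print(styled_html)
```

A production converter would more likely rely on a real CSS parser so that selectors and the cascade are honored; the regular expressions here are only meant to show the idea.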
#### Docx
**Step 1** - Convert DOCX to HTML via LibreOffice
**Step 2** - Process the HTML with presets
**Step 3** - Convert the HTML to JSON (LiveCarta inner format)
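
Step 1 of the Docx approach can be sketched as a headless LibreOffice call. The helper names `build_convert_command` and `convert_docx_to_html`, the `.docx` extension on the file name, and the `books/html` output directory are assumptions for illustration; the project may invoke LibreOffice differently. The sketch also assumes the `soffice` executable is on `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(docx_path: Path, out_dir: Path) -> List[str]:
    """Build the LibreOffice command line for a headless DOCX -> HTML conversion."""
    return [
        "soffice",                # LibreOffice executable; assumed to be on PATH
        "--headless",             # run without a GUI
        "--convert-to", "html",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_html(docx_path: Path, out_dir: Path) -> Path:
    """Run the conversion and return the path where the HTML file is expected."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(docx_path, out_dir), check=True)
    return out_dir / (docx_path.stem + ".html")


# Only build and show the command here; running it requires LibreOffice.
command = build_convert_command(Path("books/docx/book_name.docx"), Path("books/html"))
print(" ".join(command))
```

Calling `convert_docx_to_html(...)` with the same arguments would run the conversion and raise `subprocess.CalledProcessError` if LibreOffice exits with a non-zero status.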
## Setup
Install the dependencies with `python -m pip install -r requirements.txt`
## Development
To fix a bug or enhance an existing module, follow these steps:
- Fork the repo
- Create a new branch (`git checkout -b improve-feature`)
- Make the appropriate changes in the files
- Add changes to reflect the changes made
- Commit your changes (`git commit -am 'Improve feature'`)
- Push to the branch (`git push origin improve-feature`)
- Create a Pull Request
## How to Use
**a.** Run `consumer.py`
The script waits continuously for messages from the RabbitMQ queue, into which a book is loaded via "Import File to Convert" in the admin panel.
You can also upload a book that has been converted locally, using `local_convert()` in `consumer.py`
**b.** Run `docx_solver.py`
1. This must be run on a Linux system; if you are on Windows, use a Python interpreter inside Docker instead
2. Upload a book to books/docx/ and set the variable `docx_file_path = books/docx/book_name` in `__main__`
**c.** Run `epub_solver.py`
Before that, upload a book to books/epub/ and set the variable `epub_file_path = books/epub/book_name` in `__main__`