Python Forum
Converting several Markdown files into DOCX using Pandoc
Hey guys,
I have a folder called Project with the following structure:

Project:

img
docs
README

The README is written in Markdown and looks like this:

## 1. [xxxxxxxx](docs/1.md)
## 1.1 [xxxxxxxx](docs/11.md)
## 2. [xxxxxxxx](docs/2.md)
### 2.1 [xxxxxxxx](docs/21.md)
### 2.2 [xxxxxxxx](docs/22.md)
#### 2.2.1 [xxxxxxxx](docs/221.md)
#### 2.2.2 [xxxxxxxx](docs/222.md)
### 2.3 [xxxxxxxx](docs/23.md)
### 2.4. [xxxxxxxx](docs/24.md)
### 2.5. [xxxxxxxx](docs/25.md)
### 2.6. [xxxxxxxx](docs/26.md)
## 3. [xxxxxxxx](docs/3.md)
### 3.1. [xxxxxxxx](docs/31.md)
#### 3.1.1 [xxxxxxxx](docs/311.md)
#### 3.1.2 [xxxxxxxx](docs/312.md)
#### 3.1.3 [xxxxxxxx](docs/313.md)
#### 3.1.4 [xxxxxxxx](docs/314.md)
### 3.2 [xxxxxxxx](docs/32.md)
#### 3.2.1 [xxxxxxxx](docs/321.md)
#### 3.2.2. [xxxxxxxx](docs/322.md)
### 3.3. [xxxxxxxx](docs/33.md)
#### 3.3.1. [xxxxxxxx](/docs/331.md)
#### 3.3.2. [xxxxxxxx](/docs/322.md)
## 4. [xxxxxxxx](/docs/4.md)
### 4.1 [xxxxxxxx](docs/41.md)
### 4.2 [xxxxxxxx](docs/42.md)
### 4.3 [xxxxxxxx](docs/43.md)
### 4.4 [xxxxxxxx](docs/44.md)
## 5. [xxxxxxxx](/docs/5.md)
## 6. [Exxxxxxxx](/docs/6.md)

## [xxxxxxxx](docs/a_shorts.en)

The README links to the individual Markdown files. The docs folder looks like this:

1.md
2.md
3.md
4.md
5.md
6.md
11.md
21.md
22.md
23.md
......
211.md
..
a_shorts.md

The documents contain images from the img folder and cross-reference each other, e.g. 1.md has a reference to 11.md.

How can I use pandoc to convert the documents, in the order in which they appear in the README, into a DOCX file that includes the corresponding images and keeps the cross-references between the Markdown files? Is there a suitable pandoc script for this?
I tried it with this Python script:
import subprocess
import re


with open("README.md", "r") as f:
    content = f.read()
docs_order = re.findall(r'document [0-9]+((\.[0-9]+)*)', content)

# Convert each document in the specified order
for doc in docs_order:
    filename = "docs/{}.md".format(doc.replace(".", ""))
    output_file = "{}.docx".format(filename[:-3])
    result = subprocess.run(["pandoc", "-s", "-o", output_file, filename])
I moved the script to the project folder and ran it from there. However, I did not get any output or error message at all. What did I do wrong?
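
A likely reason for the silence, with a minimal sketch of one way around it: the pattern `document [0-9]+...` searches for the literal word "document", which does not occur in the README shown above, so `re.findall` returns an empty list and the loop body never runs. (Because the pattern has two capture groups, `findall` would also return tuples rather than strings, so `doc.replace` would fail as soon as anything matched.) The sketch below instead collects the link targets in README order and hands them to a single pandoc call, which concatenates its inputs into one DOCX. The helper names `linked_files` and `build_command`, the output name `project.docx`, stripping the leading slash from targets such as `/docs/4.md`, and the `--resource-path` value are all assumptions and not part of the original post.

```python
import os
import re
import subprocess
from pathlib import Path

# Matches inline Markdown links such as [title](docs/1.md); captures the target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_files(readme_text):
    """Return link targets in order of first appearance, without duplicates."""
    seen = set()
    ordered = []
    for target in LINK_RE.findall(readme_text):
        # Assumption: "/docs/4.md" is meant relative to the project folder.
        target = target.lstrip("/")
        if target not in seen:
            seen.add(target)
            ordered.append(target)
    return ordered


def build_command(files, output="project.docx"):
    """Build one pandoc call that merges all input files into a single DOCX."""
    # Assumption: images are found relative to the project folder or docs/.
    search_path = os.pathsep.join([".", "docs"])
    return ["pandoc", "-s", "--resource-path=" + search_path, "-o", output, *files]


def main():
    readme = Path("README.md")
    if not readme.is_file():
        print("README.md not found; run this from the project folder.")
        return
    targets = linked_files(readme.read_text(encoding="utf-8"))
    files = [t for t in targets if Path(t).is_file()]
    # Report links that do not point to an existing file instead of failing.
    for t in targets:
        if t not in files:
            print("skipping missing file:", t)
    if not files:
        print("no linked Markdown files found")
        return
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_command(files), check=True)


if __name__ == "__main__":
    main()
```

Run it from the Project folder. The sketch does not rewrite the links between the Markdown files (such as 1.md pointing to 11.md); they would probably have to be turned into links to heading anchors before they work inside the combined DOCX.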