I work at an accounting firm, where I am responsible for formatting, reviewing, and correcting report documents in Word. Since AI models keep getting more capable, I tried using multimodal large models such as Qwen3-VL, Gemini, and Zhipu AI to parse Word documents. However, these models only recognized the text content. They did not pick up the document's formatting, such as font size, paragraph spacing, header and footer styles, or table borders.

What I want is to define review prompts for each type of report, have the model parse and analyze the styles, and then get back its conclusions together with a corrected document.

I also tried parsing the Word formatting manually with python-docx. Because of my limited experience, some styles were not read or set accurately, and the program needed a large number of hand-written style review rules, which made the whole process inflexible.

How can I build an AI-based workflow that reviews document styles intelligently and corrects the documents, so that business staff can work more efficiently? Any guidance would be appreciated. Thank you very much!
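To make the rule-driven part of the workflow above concrete, here is a minimal sketch, not a definitive implementation. It assumes python-docx is installed. The `RULES` table, the property names (`font_size_pt`, `space_after_pt`), and both function names are hypothetical and chosen only for illustration. The idea is to keep the expected values in a data table per report type instead of hard-coding them, and to emit findings as plain dictionaries that could be handed to a model together with a review prompt.

```python
# Hypothetical rule table for one report type; the values are illustrative only.
RULES = {
    "Normal": {"font_size_pt": 12.0, "space_after_pt": 6.0},
    "Heading 1": {"font_size_pt": 16.0},
}


def extract_paragraph_formats(path):
    """Collect directly applied formatting for each paragraph of a .docx file."""
    # python-docx is imported here so that check_formats() can run without it.
    from docx import Document

    records = []
    for index, para in enumerate(Document(path).paragraphs):
        # Explicit run-level sizes only; an empty list means the size is
        # inherited from the paragraph style or the document defaults.
        sizes = [run.font.size.pt for run in para.runs if run.font.size is not None]
        space_after = para.paragraph_format.space_after
        records.append({
            "index": index,
            "style": para.style.name,
            "text": para.text[:40],  # short excerpt to help locate the paragraph
            "font_size_pt": sizes[0] if sizes else None,
            "space_after_pt": space_after.pt if space_after is not None else None,
        })
    return records


def check_formats(records, rules):
    """Compare extracted records against the rule table and list mismatches."""
    findings = []
    for rec in records:
        expected = rules.get(rec["style"], {})
        for prop, want in expected.items():
            got = rec.get(prop)
            # None means "inherited"; it is skipped because judging it would
            # require resolving the style hierarchy, which this sketch omits.
            if got is not None and abs(got - want) > 0.01:
                findings.append({
                    "index": rec["index"],
                    "property": prop,
                    "expected": want,
                    "found": got,
                })
    return findings
```

A call such as `check_formats(extract_paragraph_formats("report.docx"), RULES)` (the file name is a placeholder) would return a list of mismatches. One known limitation of this sketch: python-docx returns `None` for any property that is inherited rather than set directly, so values defined only in a style or in the document defaults are not checked here.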

Xiaozhao (new contributor)
