Resume Parsing Automation Workflow

Description

Overview

This resume parsing automation workflow streamlines the extraction and formatting of resume data from PDF files into structured HTML and PDF outputs. Utilizing an event-driven analysis pipeline triggered by Telegram messages, it converts unstructured resume PDFs into categorized, machine-readable sections through a no-code integration with OpenAI’s GPT-4 Turbo model.

The workflow is designed for recruiters, HR professionals, and developers seeking deterministic extraction of personal, educational, and professional data from resumes. It initiates with a Telegram trigger node that listens for incoming messages containing PDF resumes.

Key Benefits

Automates resume data extraction using an orchestration pipeline with GPT-4 Turbo for structured JSON output.
Transforms extracted data into comprehensive HTML sections for easy formatting and review.
Generates PDF documents from HTML using an integrated PDF conversion service for consistent delivery.
Implements authorization checks to process resumes only from approved Telegram users.

Product Overview

This automation workflow is triggered by incoming Telegram messages containing resume files in PDF format. Upon detection, the Telegram trigger node captures the message and passes it through an authorization node that validates the user’s chat ID against a predefined allowed ID. Unauthorized messages terminate the flow.

Once authorized, the workflow downloads the PDF file from Telegram using the file ID, then extracts readable text content with a dedicated PDF extraction node. This extracted text is sent to the OpenAI Chat Model node configured with GPT-4 Turbo Preview and a zero temperature setting for deterministic output. The model is prompted to analyze the resume text and return structured data conforming to a detailed JSON schema covering personal info, employment history, education, projects, volunteering, programming languages, and more.

The output parser nodes validate and auto-correct the JSON structure to ensure data integrity. Following parsing, multiple code nodes convert each data section into formatted HTML strings. These sections are merged and concatenated into a single HTML document string, which is then base64 encoded and converted into an HTML file. This file is sent to a Gotenberg server for PDF conversion. The resulting PDF is delivered back to the Telegram user by sending it as a document message, completing a fully automated resume processing and delivery pipeline.

Features and Outcomes

Core Automation

The core automation workflow accepts PDF resumes via Telegram and applies event-driven analysis to extract structured data. It uses the OpenAI Chat Model node for JSON-based resume parsing and several code nodes for HTML formatting.

Single-pass evaluation of text extraction and JSON parsing using GPT-4 Turbo with temperature set to zero.
Deterministic branching based on authorization and message content to ensure secure processing.
Automated HTML assembly from multiple resume data categories for standardized output.

Integrations and Intake

This no-code integration pipeline connects Telegram for intake, OpenAI GPT-4 Turbo for text analysis, and Gotenberg for PDF generation. The Telegram nodes utilize API credentials for secure file retrieval.

Telegram Trigger node listens for message updates containing PDF documents.
OpenAI Chat Model node processes extracted text with API key-based authentication.
HTTP Request node interfaces with Gotenberg service to convert HTML to PDF synchronously.

Outputs and Consumption

The workflow produces well-structured HTML and PDF documents representing parsed resume data. Outputs are delivered synchronously to the Telegram user as a PDF file containing formatted personal, professional, and educational details.

HTML sections include personal info, employment history, education, projects, volunteering, and technologies.
Final output is a PDF document generated via Gotenberg from the compiled HTML.
PDF is sent directly to the user’s Telegram chat using Telegram’s sendDocument operation.

Workflow — End-to-End Execution

Step 1: Trigger

The process initiates upon receiving a Telegram message update containing a document. The Telegram trigger node listens for “message” events, capturing the incoming resume PDF file uploaded by the user. Authorization is checked by comparing the sender’s chat ID to an allowed value; unauthorized requests terminate early. The workflow ignores the initial “/start” command message to prevent processing irrelevant triggers.

Step 2: Processing

The workflow downloads the PDF file using the Telegram file ID and extracts its textual content via a PDF extraction node. Basic presence checks ensure that the file and text are available. The extracted plain text is forwarded to the OpenAI Chat Model node for structured parsing.

Step 3: Analysis

The OpenAI Chat Model node runs with temperature set to 0 and JSON response format, enforcing deterministic parsing of the resume text into a JSON object adhering to a specified schema. The auto-fixing output parser corrects minor inconsistencies. The structured output parser node validates the JSON against the detailed schema, confirming data types and required fields.

Step 4: Delivery

Parsed JSON sections are individually converted to HTML strings by dedicated code nodes. These HTML segments are merged into a comprehensive document and base64 encoded. The encoded HTML is converted into a binary HTML file, then synchronously posted to a Gotenberg server for PDF generation. The resulting PDF file is sent back to the user on Telegram as a document message using the original chat ID.

Use Cases

Scenario 1

Recruiters receive numerous resume PDFs daily, requiring time-consuming manual data extraction. This automation workflow converts uploaded resumes into structured JSON and formatted PDFs automatically, reducing manual effort and ensuring consistent data formatting for easier candidate evaluation.

Scenario 2

HR teams need a reliable way to standardize resume information for applicant tracking systems. By integrating Telegram intake and GPT-based parsing, the workflow outputs validated, well-organized HTML and PDF files, facilitating downstream processing and record keeping.

Scenario 3

Developers building recruitment chatbots require deterministic extraction of resume details without custom coding. This no-code integration pipeline accepts user-uploaded PDFs, parses data with GPT, and returns formatted documents, enabling seamless chatbot resume handling with minimal configuration.

How to use

To deploy this resume parsing automation workflow, import it into your n8n instance and configure Telegram and OpenAI API credentials. Set the authorized Telegram chat ID in the Auth node to restrict access. Ensure a local or accessible Gotenberg server is available for PDF conversion. Once live, upload a resume PDF to the connected Telegram bot; the workflow extracts, formats, converts, and returns a PDF summary directly to the user’s chat. Expect structured personal and professional sections formatted in HTML and a consolidated PDF output for download.

Comparison — Manual Process vs. Automation Workflow

Attribute	Manual/Alternative	This Workflow
Steps required	Multiple manual steps: file download, reading, parsing, formatting, PDF creation	Single automated pipeline from file upload to PDF delivery
Consistency	Varies by user skill and attention; prone to errors	Deterministic JSON parsing and formatted output with schema validation
Scalability	Limited by human capacity and speed	Scales with n8n instance, handling multiple Telegram inputs simultaneously
Maintenance	High due to manual workflows and tool switching	Low; centralized workflow with configurable nodes and credential updates

Technical Specifications

Environment	n8n automation platform with Telegram and OpenAI API connectivity
Tools / APIs	Telegram Bot API, OpenAI GPT-4 Turbo, Gotenberg PDF conversion service
Execution Model	Event-driven, synchronous response for PDF generation and delivery
Input Formats	PDF files uploaded via Telegram messages
Output Formats	Structured JSON, formatted HTML, final PDF document
Data Handling	Transient processing of resume text; no persistent storage configured
Known Constraints	Requires valid API credentials, authorized Telegram chat ID, and accessible Gotenberg server
Credentials	Telegram API key, OpenAI API key

Implementation Requirements

Valid Telegram bot API credentials configured in n8n for message retrieval and file sending.
OpenAI API key with access to GPT-4 Turbo model for resume text parsing.
Accessible Gotenberg PDF conversion service endpoint for HTML-to-PDF conversion.

Configuration & Validation

Set authorized Telegram chat ID in the Auth node to restrict workflow execution.
Validate OpenAI API credentials and ensure model access with temperature set to 0 for deterministic output.
Confirm Gotenberg server accessibility by testing HTML-to-PDF conversion with sample HTML files.

Data Provenance

Initial trigger from Telegram trigger node captures incoming message events with document attachments.
OpenAI Chat Model node processes extracted text using GPT-4 Turbo with JSON response format and temperature 0.
Output parsers apply JSON schema validation and auto-fixing to ensure data structure integrity.

FAQ

How is the resume parsing automation workflow triggered?

The workflow is triggered by a Telegram trigger node that listens for incoming messages containing PDF files. Upon receiving a message with a document, the workflow initiates processing after authorization.

Which tools or models does the orchestration pipeline use?

The orchestration pipeline uses the OpenAI Chat Model node configured with GPT-4 Turbo model at zero temperature for deterministic resume data extraction. It also integrates Telegram API and Gotenberg PDF conversion service.

What does the response look like for client consumption?

The client receives a PDF document sent back to the Telegram chat, containing formatted sections such as personal information, employment history, education, projects, volunteering, and technologies.

Is any data persisted by the workflow?

No persistent storage is configured; all data processing is transient within the workflow execution context.

How are errors handled in this integration flow?

The workflow relies on n8n platform default error handling. There are no explicit retry or backoff strategies configured within the workflow nodes.

Conclusion

This resume parsing automation workflow provides a deterministic, end-to-end solution for extracting and formatting resume data from PDFs sent via Telegram. By leveraging GPT-4 Turbo for structured data extraction and integrating HTML-to-PDF conversion, it reduces manual processing steps and ensures consistent data output. The workflow requires valid API credentials and an accessible PDF conversion server, representing a dependency on external services for full operation. Its structured output and automated delivery offer reliable utility for HR and recruitment automation scenarios.

Additional information

Use Case	Education & Training, IT & Dev
Platform	n8n, OpenAI GPT
Risk Level (EU)	GPAI
Tech Stack	Custom API
Trigger Type	Chat Command, File Upload
Skill Level	Developer friendly, Low Code
Data Sensitivity	Contains PII, Highly Sensitive