Description
Overview
This workflow extracts the registered domain from a URL or an email address and normalizes it for downstream use. It matches hostnames against an embedded list of valid top-level domains (TLDs), so multi-level suffixes such as “co.uk” are handled correctly across varied input formats.
Designed for data engineers and automation specialists, the workflow also flags whether the extracted domain belongs to a known free mail provider, which supports email domain categorization. It is started by an Execute Workflow Trigger node.
Key Benefits
- Extracts registered domain names from URLs or email addresses with multi-level TLD support.
- Identifies whether the domain belongs to a free mail service using an extensive provider list.
- Processes each input item individually to maintain deterministic and isolated evaluations.
- Supports complex TLD structures including country-code and multi-part suffixes for accurate parsing.
Product Overview
This domain extraction and classification automation workflow accepts input in the form of either a URL or an email address. It begins with a trigger node that starts the process when invoked, followed by a data preparation node that sets the input field by selecting either the URL or email from the incoming JSON data. The core logic resides in a custom code node that executes JavaScript to cleanse and extract the domain.
The code node maintains an array of valid TLDs sourced from the Public Suffix List, which lets it identify domain boundaries correctly even for nested or multi-level suffixes such as “gov.au” or “co.uk”. For URLs, it extracts the hostname by stripping the protocol, port, and query parameters, then determines the registered domain by matching the hostname against the TLD list. For email inputs, it takes the substring that follows the “@” character.
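The matching step described above can be sketched as follows. This is a minimal illustration rather than the workflow's actual code: the names `SAMPLE_SUFFIXES`, `extractHostname`, and `registeredDomain` are hypothetical, the suffix set is a tiny stand-in for the embedded list, and IPv6 literals and wildcard suffix rules are not handled.

```javascript
// Illustrative subset only (assumption); the workflow embeds a far larger list.
const SAMPLE_SUFFIXES = new Set(["com", "org", "uk", "co.uk", "au", "gov.au"]);

// Reduce a URL-like string to its bare, lowercase hostname.
function extractHostname(input) {
  return input
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // strip the protocol, e.g. "https://"
    .split(/[/?#]/)[0]                       // drop path, query string, fragment
    .split("@").pop()                        // drop any user-info before "@"
    .split(":")[0];                          // drop the port
}

// Return the longest known suffix plus one label to its left, or null.
function registeredDomain(hostname) {
  const labels = hostname.split(".").filter(Boolean);
  // Starting at i = 0 tries the longest candidate suffix first.
  for (let i = 0; i < labels.length; i++) {
    const suffix = labels.slice(i).join(".");
    if (SAMPLE_SUFFIXES.has(suffix)) {
      // A hostname that is itself a suffix has no registered domain.
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null; // no known suffix matched
}

console.log(registeredDomain(extractHostname("https://www.example.co.uk:8080/path?q=1")));
// → example.co.uk
```

Trying the longest candidate first is what keeps “co.uk” from being mistaken for a registered domain under “uk”.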
The workflow then checks the extracted domain against a free mail provider list and returns a boolean flag indicating whether the domain belongs to a free email service such as Gmail or Yahoo. Processing runs synchronously inside the code node, and the results are appended to the item JSON for downstream consumption. Error handling follows the platform defaults, with continue-on-fail enabled so that a failure on one item does not stop the whole workflow.
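A rough sketch of the free mail check and of appending the results to an item is shown below. The names `SAMPLE_FREE_MAIL_DOMAINS`, `isFreeMailProvider`, and `appendResult` are hypothetical, and the three-entry set only stands in for the workflow's much longer embedded list.

```javascript
// Illustrative subset (assumption); the embedded provider list is much longer.
const SAMPLE_FREE_MAIL_DOMAINS = new Set(["gmail.com", "yahoo.com", "outlook.com"]);

// True only when the domain is a string found in the provider set.
function isFreeMailProvider(domain) {
  return typeof domain === "string" &&
    SAMPLE_FREE_MAIL_DOMAINS.has(domain.toLowerCase());
}

// Return a copy of the item with the two result fields added to its JSON.
function appendResult(item, domain) {
  return {
    ...item,
    json: {
      ...item.json,
      domain,
      free_mail_provider: isFreeMailProvider(domain),
    },
  };
}
```

Using a `Set` keeps the lookup constant-time even when the provider list grows to thousands of entries.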
Features and Outcomes
Core Automation
This pipeline first determines whether each input is a URL or an email address, then applies domain extraction heuristics against the embedded TLD list. The extraction logic runs once per item, so each input is handled in isolation.
- Single-pass evaluation of each input for domain extraction and free mail identification.
- Comprehensive TLD matching enables accurate parsing of complex domain suffixes.
- Boolean flag output distinguishes free mail provider domains from others.
Integrations and Intake
The workflow integrates with the n8n platform’s execute workflow trigger node for initiation and uses a set node to prepare inputs. It accepts JSON objects containing either a “url” or “email” field.
- Execute Workflow Trigger node starts the automation on demand or via an external call.
- Set node organizes input data to a standardized “input” field for processing.
- Code node runs JavaScript to parse domain and check against free mail domains.
Outputs and Consumption
The workflow produces a structured JSON output including the original input, the extracted domain, and a boolean indicating free mail provider status. This synchronous output allows immediate downstream use in enrichment or filtering pipelines.
- Output fields: “input” (original string), “domain” (parsed domain or null), “free_mail_provider” (true/false).
- JSON format suitable for integration with data validation or CRM enrichment systems.
- Supports both URL and email input types with consistent output schema.
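As a sketch of the output schema listed above, a downstream step could guard against malformed items with a small shape check. The function name `matchesOutputSchema` and the example values are hypothetical and only illustrate the three fields.

```javascript
// Check that an object has the three documented output fields with the
// documented types: string, string-or-null, boolean.
function matchesOutputSchema(json) {
  return (
    typeof json === "object" && json !== null &&
    typeof json.input === "string" &&
    (json.domain === null || typeof json.domain === "string") &&
    typeof json.free_mail_provider === "boolean"
  );
}

// Hypothetical example values, shown only to illustrate the shape.
const exampleOutputs = [
  { input: "https://www.example.co.uk/pricing", domain: "example.co.uk", free_mail_provider: false },
  { input: "not a domain", domain: null, free_mail_provider: false },
];

console.log(exampleOutputs.every(matchesOutputSchema)); // → true
```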
Workflow — End-to-End Execution
Step 1: Trigger
The workflow is initiated by the “Execute Workflow Trigger” node, which can be called manually or via an external automation call. It requires no special headers or parameters beyond the presence of either a “url” or “email” field in JSON input.
Step 2: Processing
The “Prepare data before function” set node consolidates the input by taking either the URL or the email field from the incoming data and assigning it to a single “input” attribute. This gives the domain extraction logic a consistent input structure. No validation schema is applied; the node relies on basic presence checks.
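The consolidation step might look like the sketch below when written as plain JavaScript. The function name `prepareInput` is hypothetical, and preferring “url” over “email” when both are present is an assumption; the source does not state which field takes precedence.

```javascript
// Pick the first non-empty string among url and email (url first is an
// assumption) and expose it under a single "input" field.
function prepareInput(json) {
  const candidates = [json.url, json.email];
  const chosen = candidates.find(
    (value) => typeof value === "string" && value.length > 0
  );
  return { input: chosen === undefined ? null : chosen };
}

console.log(prepareInput({ email: "jane.doe@example.com" }));
// → { input: 'jane.doe@example.com' }
```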
Step 3: Analysis
The “Extract domain” code node executes JavaScript that first determines whether the input is an email or URL. For URLs, it extracts the hostname by removing protocol, port, and query parameters, then identifies the registered domain by matching against a comprehensive TLD list. For emails, it extracts the domain part after the “@” symbol. It also checks if the domain matches any in the free mail provider list, setting a boolean flag accordingly.
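The email-versus-URL decision and the email branch could be sketched as below. The names `detectInputType` and `emailDomain` are hypothetical, and the detection rule (exactly one “@” with no slashes or whitespace around it) is an assumed heuristic, not necessarily the one the workflow uses.

```javascript
// Treat the input as an email when it is a single "local@domain" token with
// no slashes or whitespace (assumed heuristic); everything else is a URL.
function detectInputType(input) {
  return /^[^\s@\/]+@[^\s@\/]+$/.test(input.trim()) ? "email" : "url";
}

// Return the lowercase part after the last "@", or null if there is none.
function emailDomain(email) {
  const at = email.lastIndexOf("@");
  if (at === -1) return null;
  const domain = email.slice(at + 1).trim().toLowerCase();
  return domain === "" ? null : domain;
}

console.log(detectInputType("jane.doe@example.com")); // → email
console.log(emailDomain("Jane.Doe@Example.COM"));     // → example.com
```

Checking for slashes keeps URLs that carry credentials, such as `https://user@example.com/x`, on the URL branch.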
Step 4: Delivery
The node outputs a JSON object synchronously, augmenting the workflow item with fields “input”, “domain”, and “free_mail_provider”. This output can be consumed immediately by downstream workflows or external systems for further processing or decision-making.
Use Cases
Scenario 1
Data teams require normalization of user input containing URLs or emails for customer segmentation. This automation workflow extracts the registered domain and identifies free mail providers, enabling deterministic categorization without manual domain parsing.
Scenario 2
Marketing platforms ingest contact lists with mixed URL and email formats. Using this domain extraction pipeline, they can cleanse and enrich records by reliably extracting domains and flagging free email service users, supporting targeted campaign logic.
Scenario 3
Security analysts need to filter inbound traffic or email domains to identify external free mail sources potentially indicating phishing attempts. This automation workflow provides structured domain extraction and classification in a single response cycle for real-time processing.
How to use
Integrate this domain extraction workflow into your n8n environment by importing the workflow JSON. Configure the execute workflow trigger node as the entry point. Provide input JSON objects containing either a “url” or “email” field to the workflow trigger. The workflow processes each input individually, extracting the domain and determining free mail status.
Results are returned synchronously within the workflow output, including the original input, extracted domain, and a boolean flag. Use these outputs in downstream automation steps for data cleansing, filtering, or enrichment tasks.
Comparison — Manual Process vs. Automation Workflow
| Attribute | Manual/Alternative | This Workflow |
|---|---|---|
| Steps required | Multiple manual steps including URL cleansing and TLD validation. | Single automated pipeline with integrated domain extraction and classification. |
| Consistency | Variable results due to human error and inconsistent TLD knowledge. | Deterministic extraction using an authoritative TLD list ensures uniformity. |
| Scalability | Limited by manual processing capacity and complexity of TLDs. | Scales automatically to process large datasets with item-wise evaluation. |
| Maintenance | Requires frequent manual updates to TLD and free mail provider lists. | TLD and provider lists are embedded in the code node and can be updated by editing the workflow. |
Technical Specifications
| Specification | Details |
|---|---|
| Environment | n8n automation platform |
| Tools / APIs | Execute Workflow Trigger, Set, and Code nodes |
| Execution Model | Synchronous processing within a single workflow run |
| Input Formats | JSON with “url” or “email” string fields |
| Output Formats | JSON fields: input, domain, free_mail_provider (boolean) |
| Data Handling | Transient processing with no persistence beyond workflow run |
| Known Constraints | Relies on embedded TLD and free mail provider lists for accuracy |
| Credentials | None required for domain extraction logic |
Implementation Requirements
- Provision of input JSON containing either a “url” or “email” field.
- n8n environment capable of running Execute Workflow Trigger, Set, and Code nodes.
- Maintenance access to update TLD and free mail provider arrays as needed.
Configuration & Validation
- Import the workflow JSON into your n8n instance and verify nodes are connected as per design.
- Test the workflow by triggering it with sample inputs containing URLs or emails.
- Confirm that outputs include correctly extracted domains and accurate free mail provider flags.
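One way to carry out the confirmation step is to compare the workflow's output JSON against a handful of expected results. The sketch below assumes the output items have already been collected into an array; `expectedCases` and `findMismatches` are hypothetical names, and the expected values are illustrative and should be replaced with cases relevant to your data.

```javascript
// Hypothetical expectations; adjust to the inputs you actually test with.
const expectedCases = [
  { input: "https://www.example.co.uk/pricing", domain: "example.co.uk", free_mail_provider: false },
  { input: "jane.doe@gmail.com", domain: "gmail.com", free_mail_provider: true },
];

// Compare output items against expectations and describe every mismatch.
function findMismatches(actualItems, expected) {
  const byInput = new Map(actualItems.map((item) => [item.input, item]));
  const problems = [];
  for (const want of expected) {
    const got = byInput.get(want.input);
    if (!got) {
      problems.push(`${want.input}: no output item`);
      continue;
    }
    for (const field of ["domain", "free_mail_provider"]) {
      if (got[field] !== want[field]) {
        problems.push(
          `${want.input}: ${field} was ${JSON.stringify(got[field])}, ` +
          `expected ${JSON.stringify(want[field])}`
        );
      }
    }
  }
  return problems;
}
```

An empty array from `findMismatches` means every expected case matched.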
Data Provenance
- Trigger: Execute Workflow Trigger node initiates domain extraction.
- Processing: Set node standardizes input field to “input”.
- Extraction: Code node “Extract domain” runs JavaScript for domain parsing and classification using embedded TLD and free mail provider lists.
FAQ
How is the domain extraction automation workflow triggered?
The workflow is triggered via the Execute Workflow Trigger node, accepting JSON input with either a “url” or “email” field to start processing.
Which tools or models does the orchestration pipeline use?
The pipeline uses n8n’s code node to run custom JavaScript logic, referencing embedded arrays of valid TLDs and free mail providers for domain extraction and classification.
What does the response look like for client consumption?
The workflow returns a JSON object with fields “input” (original input), “domain” (extracted domain or null), and “free_mail_provider” (boolean indicating free mail status).
Is any data persisted by the workflow?
No data is persisted; processing is transient and results are available only during the workflow execution context.
How are errors handled in this integration flow?
Errors do not halt the workflow because “continueOnFail” is enabled; otherwise, the platform’s default error handling applies.
Conclusion
This workflow parses registered domains from URLs or email addresses deterministically, using an embedded TLD list, and flags domains that belong to free mail providers for classification and downstream processing. It runs synchronously within n8n and requires no external credentials, but its accuracy depends on the embedded TLD and provider lists. The design supports consistent domain parsing at scale without manual intervention, which suits a range of data cleansing and enrichment applications.







