Link and Email Extractor

Paste any text or HTML and extract every URL and email address in one pass. Dedupes, sorts, and counts unique hosts — perfect for auditing content exports or comment dumps.

Extraction Guide

From messy text to clean list in one paste

Whether you're auditing a content export, cleaning up a forum dump, or extracting contacts from a CSV you can't open — manually finding every URL and email is tedious. This tool does it in one paste with dedupe and sort options.

HTML-aware extraction

We strip HTML tags before scanning, so URLs inside href="..." attributes get caught too. Works equally well on plain text, HTML source, Markdown, and mixed content.
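
The tool's actual pattern isn't shown on this page, so the following is only a minimal sketch of why attribute URLs are reachable at all. It assumes a hypothetical pattern, `URL_PATTERN`, that stops at whitespace, quotes, and angle brackets, which is enough to lift a URL out of an href value as well as out of visible text.

```python
import re

# Hypothetical pattern for illustration; not the tool's real regex.
# A URL ends at whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

html = (
    '<p>Docs: <a href="https://example.com/docs">read here</a> '
    "or see https://example.org/faq</p>"
)

# One URL comes from the href attribute, the other from the visible text.
urls = URL_PATTERN.findall(html)
print(urls)
```

Because the closing quote and the `<` of `</p>` are excluded from the match, neither URL picks up stray markup.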

Dedupe and sort

Toggle dedupe to get only unique URLs/emails. Toggle sort for alphabetical output — great for copy-paste into lists where order matters.
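
As a rough illustration of what the two toggles do, here is a small sketch. The helper name `dedupe` and the sample URLs are assumptions made for this example, not the tool's code.

```python
def dedupe(items):
    """Drop repeats while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(items))


found = [
    "https://example.com/b",
    "https://example.com/a",
    "https://example.com/b",  # repeat of the first entry
]

unique = dedupe(found)    # dedupe only: original order is kept
ordered = sorted(unique)  # dedupe + sort: alphabetical order

print(unique)
print(ordered)
```

With dedupe alone the list stays in the order the URLs appeared in the input; adding sort rearranges it alphabetically.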

URLs only, emails only, or both

Filter the output based on what you need. URL-only mode is common for backlink audits, email-only mode for building outreach lists, and both together for a comprehensive content audit.

Unique hosts

We count distinct hostnames across all extracted URLs — a quick proxy for how many different sites are linked. Useful for backlink quality audits.
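
A host count like this can be sketched with Python's standard `urllib.parse`. The function name `unique_hosts` and the sample URLs are illustrative assumptions; the tool's own counting logic may differ.

```python
from urllib.parse import urlsplit


def unique_hosts(urls):
    """Collect the distinct hostnames found in a list of URLs."""
    hosts = set()
    for url in urls:
        # .hostname is lowercased, and is None when the URL has no host part
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return hosts


urls = [
    "https://example.com/pricing",
    "https://EXAMPLE.com/blog",
    "https://docs.example.org/start",
]

hosts = unique_hosts(urls)
print(len(hosts))  # 2
```

In this sketch subdomains count separately, so `www.example.com` and `example.com` would be two hosts.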

Trailing punctuation handling

Sentence-ending punctuation after a URL is trimmed: 'visit example.com.' yields 'example.com', not 'example.com.'. It's a small detail, but a big time saver when pasting prose.
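
One simple way to get this behavior is to strip a fixed set of punctuation characters from the end of each match. The character set and the function name below are assumptions for illustration; the tool may trim a different set.

```python
# Assumed set of characters that usually belong to the sentence, not the URL.
TRAILING = ".,;:!?"


def trim_trailing_punctuation(url):
    """Remove sentence punctuation from the end of a matched URL."""
    return url.rstrip(TRAILING)


print(trim_trailing_punctuation("example.com."))                   # example.com
print(trim_trailing_punctuation("https://example.com/page?x=1,"))  # https://example.com/page?x=1
print(trim_trailing_punctuation("https://example.com/a.b"))        # https://example.com/a.b
```

Only the end of the string is touched, so dots and question marks inside the URL are left alone.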

Email case normalization

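Emails are lowercased before they enter the dedupe set, so "Bob@Example.com" and "bob@example.com" collapse to one entry. That matches how mail is routed in practice: domain names are case-insensitive, and although the SMTP standard lets a server treat the local part as case-sensitive, almost no provider does.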
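
The normalization step can be sketched in a few lines. The function name `dedupe_emails` and the sample addresses are hypothetical; this is one possible shape for the behavior, not the tool's source.

```python
def dedupe_emails(emails):
    """Lowercase each address, then drop repeats while keeping first-seen order."""
    return list(dict.fromkeys(email.lower() for email in emails))


emails = ["Bob@Example.com", "bob@example.com", "amy@example.com"]

print(dedupe_emails(emails))  # ['bob@example.com', 'amy@example.com']
```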

Pro Tips

Strip before sending outreach

Email-only mode + dedupe + sort = clean outreach list. Paste your CSV blob, extract emails, copy the clean list. Takes 3 seconds instead of 10 minutes.

Backlink audit trick

Paste the HTML of a competitor's page that links to you. Filter to URLs only and check the unique host count. You now know how many distinct sites that page links to.

Privacy reminder

This tool runs entirely in your browser — nothing uploads. Safe for extracting from customer exports, internal docs, or anything you wouldn't paste into a third-party site.

Frequently Asked Questions

Does it find tel: or mailto: prefixed URLs?

Emails are caught directly (the mailto: prefix is optional). tel: links aren't extracted; use Find & Replace with a regex if you need them separately.

What about international (non-ASCII) domains?

URLs with IDN (internationalized domain names) usually appear Punycode-encoded in HTML (xn--). Our extractor catches those. Raw Unicode domains are caught only if the surrounding text matches the https:// pattern.

Can I extract social handles (@username)?

Not with this tool: @username patterns would conflict with email detection. Use Find & Replace with a regex like /@\w+/g to extract social handles.

Is it accurate for really messy text?

Yes. The regex is permissive enough to catch URLs in most contexts but strict enough to reject random strings. For pathological cases (e.g., URLs split across lines), you may need to clean the input first with Text Cleaner.
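
For the social-handle question above, the conflict with email detection is easy to see in a short sketch. The first pattern is the Python equivalent of the /@\w+/g regex from the FAQ. The second, with a lookbehind, is an assumed refinement added here for illustration and is not something the tool provides.

```python
import re

text = "Thanks @jane_doe, mail bob@example.com or ping @dev42"

# Equivalent of /@\w+/g: also matches the "@example" inside the email address.
naive = re.findall(r"@\w+", text)

# Assumed refinement: skip any "@" that directly follows a character
# that can appear in an email local part.
strict = re.findall(r"(?<![\w.+-])@\w+", text)

print(naive)   # ['@jane_doe', '@example', '@dev42']
print(strict)  # ['@jane_doe', '@dev42']
```

If your text mixes handles and email addresses, the stricter pattern avoids picking up email domains as handles.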
