Link and Email Extractor
Paste any text or HTML and extract every URL and email address in one pass. Dedupes, sorts, and counts unique hosts — perfect for auditing content exports or comment dumps.
Extraction Guide
From messy text to clean list in one paste
Whether you're auditing a content export, cleaning up a forum dump, or extracting contacts from a CSV you can't open — manually finding every URL and email is tedious. This tool does it in one paste with dedupe and sort options.
HTML-aware extraction
The extractor handles HTML markup as well as visible text, so URLs inside href="..." attributes get caught too. Works equally well on plain text, HTML source, Markdown, and mixed content.
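One way to catch URLs in both href attributes and visible text is sketched below in Python, using the standard-library html.parser module. This is an illustration only, not the tool's actual code: the names URL_RE, LinkCollector, and extract_urls_from_html are hypothetical, and the pattern is deliberately simpler than a production regex would be.

```python
import re
from html.parser import HTMLParser

# Deliberately simple pattern (an assumption, not the tool's real regex).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


class LinkCollector(HTMLParser):
    """Collects URLs from href attributes and from visible text."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "href" and value and URL_RE.fullmatch(value):
                self.urls.append(value)

    def handle_data(self, data):
        # Text between tags; also covers plain-text input with no tags at all.
        self.urls.extend(URL_RE.findall(data))


def extract_urls_from_html(source):
    parser = LinkCollector()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    return parser.urls
```

Because the parser walks tags and text separately, the same function can be fed plain text, HTML source, or a mix of the two.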
Dedupe and sort
Toggle dedupe to get only unique URLs/emails. Toggle sort for alphabetical output — great for copy-paste into lists where order matters.
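The two toggles can be pictured as independent steps. The Python sketch below is a minimal illustration under assumptions: dedupe_and_sort is a hypothetical name, duplicates are dropped keeping the first occurrence, and the sort is case-insensitive. The tool's exact ordering rules may differ.

```python
def dedupe_and_sort(items, dedupe=True, sort=True):
    """Return items with optional first-seen dedupe and alphabetical sort."""
    result = list(items)
    if dedupe:
        # dict.fromkeys keeps the first occurrence and preserves input order.
        result = list(dict.fromkeys(result))
    if sort:
        # Case-insensitive alphabetical order (an assumption for this sketch).
        result = sorted(result, key=str.lower)
    return result
```

With sort switched off, the output keeps the order in which items first appeared in the input.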
URLs only, emails only, or both
Filter the output based on what you need. URL-only mode is common for backlink audits, email-only for outreach list building, and both for a comprehensive content audit.
Unique hosts
We count distinct hostnames across all extracted URLs — a quick proxy for how many different sites are linked. Useful for backlink quality audits.
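Counting distinct hostnames can be sketched with Python's urllib.parse. The function name count_unique_hosts is hypothetical, and this sketch assumes that subdomains such as www.example.com and example.com count as separate hosts; the tool may group them differently.

```python
from urllib.parse import urlsplit


def count_unique_hosts(urls):
    """Count distinct hostnames across a list of absolute URLs."""
    hosts = set()
    for url in urls:
        try:
            # .hostname is lowercased and excludes port and credentials.
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed input (for example a broken IPv6 literal) is skipped.
            continue
        if host:
            hosts.add(host)
    return len(hosts)
```

Using the parsed hostname rather than the raw string means that differences in case, port, or path do not inflate the count.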
Trailing punctuation handling
URLs followed by sentence-ending punctuation get cleaned: 'visit https://example.com.' extracts 'https://example.com', not 'https://example.com.'. Small detail, big time saver when pasting prose.
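A minimal Python sketch of this kind of cleanup follows. The name strip_trailing_punctuation and the exact character set are assumptions. The handling of an unmatched closing bracket goes beyond what is described above and is included only as a common companion rule.

```python
TRAILING = ".,;:!?"  # characters treated as sentence punctuation (an assumption)


def strip_trailing_punctuation(url):
    """Remove sentence punctuation that was captured at the end of a URL."""
    cleaned = url.rstrip(TRAILING)
    # Extra rule for this sketch: drop a closing bracket only when the URL
    # has no matching opener, so paths like /wiki/Foo_(bar) stay intact.
    while cleaned and cleaned[-1] in ")]":
        opener = "(" if cleaned[-1] == ")" else "["
        if cleaned.count(opener) >= cleaned.count(cleaned[-1]):
            break
        cleaned = cleaned[:-1].rstrip(TRAILING)
    return cleaned
```

Only the end of the string is touched, so dots and other punctuation inside the URL are left alone.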
Email case normalization
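Emails are lowercased before they enter the dedupe set, so "Bob@Example.com" and "bob@example.com" collapse to one entry. This matches how most mail providers treat addresses in practice, even though the email standard technically allows case-sensitive local parts.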
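Case-insensitive dedupe of email addresses can be sketched in a few lines of Python. The function name dedupe_emails is hypothetical, and this version returns the lowercased form in first-seen order, which is an assumption about the output rather than a description of the tool.

```python
def dedupe_emails(emails):
    """Dedupe email addresses case-insensitively, keeping first-seen order."""
    seen = set()
    unique = []
    for email in emails:
        key = email.strip().lower()  # normalized form used for comparison
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

For example, passing "Bob@Example.com" and "bob@example.com" yields a single entry.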
Pro Tips
Email-only mode + dedupe + sort = clean outreach list. Paste your CSV blob, extract emails, copy the clean list. Takes 3 seconds instead of 10 minutes.
Paste the HTML of a page that links to you. Filter URLs only. Check the unique hosts count. You now know how many distinct sites that page links to (its own host included, if it has internal links).
This tool runs entirely in your browser — nothing uploads. Safe for extracting from customer exports, internal docs, or anything you wouldn't paste into a third-party site.
Frequently Asked Questions
- Does it find tel: or mailto: prefixed URLs?
- Emails are caught directly (mailto: prefix optional). tel: links aren't extracted — use Find & Replace with a regex if you need them separately.
- What about international (non-ASCII) domains?
- URLs with IDN (internationalized domain names) usually appear Punycode-encoded in HTML (xn--). Our extractor catches those. Raw Unicode domains are caught only when they appear in a full URL starting with https://.
- Can I extract social handles (@username)?
- Not with this tool — @username patterns would conflict with email detection. Use Find & Replace with a regex like /@\w+/g to extract social handles.
- Is it accurate for really messy text?
- Yes — the regex is permissive enough to catch URLs in most contexts but strict enough to reject random strings. For pathological cases (e.g., URLs split across lines), you may need to clean the input first with Text Cleaner.
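For the tel: and @username cases mentioned in the questions above, a separate regex pass is one option. The Python sketch below is illustrative only: TEL_RE, HANDLE_RE, and both function names are hypothetical, and the handle pattern is a variant of /@\w+/g with a lookbehind added so that the domain part of an email address is not reported as a handle.

```python
import re

# tel: links such as tel:+1-555-0100 (illustrative pattern, an assumption).
TEL_RE = re.compile(r"tel:\+?[0-9][0-9().\-]*")

# @username that is not preceded by a character that could belong to an
# email's local part, so "bob@example.com" does not yield "@example".
HANDLE_RE = re.compile(r"(?<![\w.+\-])@\w+")


def extract_tel_links(text):
    """Return every tel: link found in the text."""
    return TEL_RE.findall(text)


def extract_handles(text):
    """Return every standalone @username found in the text."""
    return HANDLE_RE.findall(text)
```

Running the handle pass on text that also contains emails is safe with this variant, because the lookbehind rejects any @ that directly follows a word character.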