How to inspect text with this tool
Paste any text into the input. The tool shows a summary: the number of characters (codepoints), how that count differs from the UTF-16 length, the size in UTF-8 bytes, and how many invisible or non-ASCII characters the text contains. Below the summary is a per-character table. Each row gives the glyph, the Unicode codepoint in U+ and decimal form, a name or category, the UTF-8 byte sequence, the UTF-16 code units, and ready-to-paste JavaScript and HTML escapes. Rows that are invisible or confusable are highlighted. Use "Copy cleaned text" to get the same text with the invisible characters removed. Nothing is uploaded.
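For readers curious how such numbers can be derived, here is a minimal sketch in plain JavaScript. The function names `summarize` and `describe` are hypothetical and chosen for illustration; this is not the tool's actual source, only one way to compute the same kind of summary and table row with standard APIs (`Array.from`, `codePointAt`, `TextEncoder`).

```javascript
// Hypothetical helpers, not the tool's real code.
function summarize(text) {
  // Array.from walks the string by code point, not by UTF-16 code unit.
  const chars = Array.from(text);
  return {
    codePoints: chars.length,
    utf16Units: text.length,
    utf8Bytes: new TextEncoder().encode(text).length,
    nonAscii: chars.filter((ch) => ch.codePointAt(0) > 0x7f).length,
  };
}

// Builds one table row for a single character (one code point).
function describe(ch) {
  const cp = ch.codePointAt(0);
  const hex = cp.toString(16).toUpperCase();
  const utf8 = Array.from(new TextEncoder().encode(ch), (b) =>
    b.toString(16).toUpperCase().padStart(2, "0")
  ).join(" ");
  const utf16 = [];
  for (let i = 0; i < ch.length; i++) {
    utf16.push(ch.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0"));
  }
  return {
    glyph: ch,
    codePoint: "U+" + hex.padStart(4, "0"),
    decimal: cp,
    utf8: utf8,
    utf16: utf16.join(" "),
    jsEscape: "\\u{" + hex + "}",
    htmlEscape: "&#x" + hex + ";",
  };
}

// One letter, one emoji, one zero-width space:
// 3 code points, 4 UTF-16 units, 8 UTF-8 bytes, 2 non-ASCII.
console.log(summarize("a\u{1F600}\u200B"));
console.log(describe("\u{1F600}")); // utf8 "F0 9F 98 80", utf16 "D83D DE00"
```

The gap between `codePoints` and `utf16Units` comes from characters outside the Basic Multilingual Plane, which occupy two UTF-16 code units each. Note that `TextEncoder` replaces lone surrogates with U+FFFD, so byte counts for malformed input reflect that substitution.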
Hidden and invisible characters
Plenty of valid Unicode characters render as nothing at all. Zero-width characters, namely the zero-width space (U+200B), the zero-width non-joiner and joiner (U+200C, U+200D), and the word joiner (U+2060), take up no visible room. The byte-order mark (U+FEFF) often sneaks in at the start of files. Bidirectional controls (U+202A–U+202E and the isolates U+2066–U+2069) can reorder how text displays without changing its stored order, which is the basis of the "Trojan Source" attack. Variation selectors (U+FE00–U+FE0F) and Unicode tag characters (U+E0000–U+E007F) are invisible too and can be packed between visible letters to hide data. This tool surfaces all of them, so text that looks ordinary but carries a hidden payload is exposed at a glance.
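The ranges listed above can be checked with a simple table lookup. The sketch below is an illustration under assumptions: the names `INVISIBLE_RANGES`, `classify`, `findInvisible` and `stripInvisible` are hypothetical, the table covers only the codepoints named in this section, and the tool's own list may be longer.

```javascript
// Illustrative range table: [first, last, label]. Not the tool's real list.
const INVISIBLE_RANGES = [
  [0x200b, 0x200d, "zero-width space, non-joiner or joiner"],
  [0x2060, 0x2060, "word joiner"],
  [0xfeff, 0xfeff, "byte-order mark"],
  [0x202a, 0x202e, "bidi embedding, override or pop"],
  [0x2066, 0x2069, "bidi isolate"],
  [0xfe00, 0xfe0f, "variation selector"],
  [0xe0000, 0xe007f, "tag character"],
];

// Returns a label for an invisible code point, or null for anything else.
function classify(cp) {
  const hit = INVISIBLE_RANGES.find(([lo, hi]) => cp >= lo && cp <= hi);
  return hit ? hit[2] : null;
}

// Lists every invisible character with its offset in UTF-16 code units.
function findInvisible(text) {
  const hits = [];
  let offset = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const kind = classify(cp);
    if (kind) hits.push({ offset: offset, codePoint: cp, kind: kind });
    offset += ch.length;
  }
  return hits;
}

// Returns the text with every listed invisible character removed.
function stripInvisible(text) {
  return Array.from(text)
    .filter((ch) => classify(ch.codePointAt(0)) === null)
    .join("");
}

console.log(findInvisible("a\u200Bb\u{E0041}")); // two hits, at offsets 1 and 3
console.log(stripInvisible("a\u200Bb\uFEFF")); // "ab"
```

One design caveat: the zero-width joiner and variation selectors also have legitimate uses, for example in emoji sequences, and the zero-width non-joiner matters in scripts such as Persian. A blanket strip like this one can therefore change how such text renders, so reviewing the flagged characters before removing them is the safer habit.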
AI watermarks and tracked text
Invisible characters are increasingly used to watermark or fingerprint text — including some AI-generated output and "leaked document" traps — by inserting zero-width or tag characters in a pattern between the visible words. Because these marks are nothing more than hidden codepoints, an inspector that lists every character makes them visible, and stripping them returns clean text. Paste a suspect message or a copied AI response, check the flagged rows, and copy the cleaned version to remove the hidden marks. No detector can promise to catch every conceivable scheme, but the common watermarking characters are exactly the invisible codepoints this tool flags.
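As one concrete illustration of why listing codepoints is enough to expose such marks: the tag characters U+E0020–U+E007E mirror printable ASCII at an offset of 0xE0000, so a hidden tag payload can be read back by subtraction. The sketch below assumes that specific encoding; `hideInTags` and `revealTagPayload` are hypothetical names, and real watermarking schemes may use entirely different patterns (for example sequences of zero-width characters) that this does not decode.

```javascript
// Demo only: encodes printable ASCII as invisible tag characters.
function hideInTags(ascii) {
  return Array.from(ascii, (ch) =>
    String.fromCodePoint(0xe0000 + ch.charCodeAt(0))
  ).join("");
}

// Collects tag characters found in the text and maps them back to ASCII.
function revealTagPayload(text) {
  let payload = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      payload += String.fromCodePoint(cp - 0xe0000);
    }
  }
  return payload;
}

// "hello" with the marker "id7" hidden in the middle.
const marked = "hel" + hideInTags("id7") + "lo";
console.log(marked.length); // 11 UTF-16 units, although only 5 letters are visible
console.log(revealTagPayload(marked)); // "id7"
```

A length mismatch like the one above, where a five-letter word reports eleven UTF-16 units, is often the first hint that a string carries something hidden.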
Confusable homoglyphs
Some characters are visible but deceptive: the Cyrillic а (U+0430) and the Latin a (U+0061) look identical in most fonts, as do many Greek and full-width lookalikes. These confusables are how spoofed domain names and impersonating usernames are built, and how code can hide an identifier that is not what it appears to be. The inspector flags common homoglyphs and shows the ASCII letter they imitate, so a string that reads as plain English but is secretly mixed-script gives itself away.
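A homoglyph check of this kind can be sketched as a lookup from a confusable character to the ASCII letter it imitates. The table below is a tiny hand-picked sample for illustration only, and the names `CONFUSABLES`, `imitatedAscii`, `findConfusables` and `skeleton` are hypothetical; the tool's actual table is not shown here. A fuller data set is published as confusables.txt alongside Unicode Technical Standard #39.

```javascript
// Small illustrative sample, written as escapes so the lookalikes stay visible in source.
const CONFUSABLES = new Map([
  ["\u0430", "a"], // Cyrillic small a
  ["\u0435", "e"], // Cyrillic small ie
  ["\u043E", "o"], // Cyrillic small o
  ["\u0440", "p"], // Cyrillic small er
  ["\u0441", "c"], // Cyrillic small es
  ["\u0445", "x"], // Cyrillic small ha
  ["\u03BF", "o"], // Greek small omicron
]);

// Returns the ASCII character that ch imitates, or null if it is not in the sample.
function imitatedAscii(ch) {
  const cp = ch.codePointAt(0);
  // Full-width forms U+FF01..U+FF5E mirror ASCII 0x21..0x7E at a fixed offset.
  if (cp >= 0xff01 && cp <= 0xff5e) return String.fromCodePoint(cp - 0xfee0);
  return CONFUSABLES.has(ch) ? CONFUSABLES.get(ch) : null;
}

// Lists each confusable character together with the letter it imitates.
function findConfusables(text) {
  const hits = [];
  for (const ch of text) {
    const ascii = imitatedAscii(ch);
    if (ascii !== null) {
      hits.push({ glyph: ch, codePoint: ch.codePointAt(0), imitates: ascii });
    }
  }
  return hits;
}

// Replaces every known confusable with its ASCII lookalike.
function skeleton(text) {
  return Array.from(text, (ch) => imitatedAscii(ch) || ch).join("");
}

const spoofed = "p\u0430ypal"; // second letter is Cyrillic
console.log(findConfusables(spoofed)); // one hit: U+0430 imitating "a"
console.log(skeleton(spoofed) === "paypal"); // true
```

Comparing `skeleton(text)` with the original string is a quick way to tell whether anything was substituted: if the two differ, at least one character was a lookalike from the table.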
Why a local Unicode inspector matters
The text you check for hidden characters is often exactly the text you would not want to upload — a private message you suspect is tracked, an internal document, a prompt, or source code. Sending it to an online analyzer to "find the invisible characters" hands the whole thing to a third-party server. The point of the check is privacy, so the check itself should be private.
This tool is plain JavaScript that runs inside your own browser tab. Codepoint iteration, the invisible-character scan and the homoglyph check all happen in memory on your device, with no network request, no account and no logging. Close the tab and the text is gone. That is the gitime.dev approach across the board: deterministic, dependency-light developer tools that keep your data where it belongs.
- Per-character codepoint, UTF-8 bytes, UTF-16 code units and escapes.
- Detects zero-width, bidi, tag and variation-selector characters.
- Flags confusable homoglyphs and unusual spaces.
- Copy cleaned text with invisible marks removed.
- Everything stays on your device — no upload, no logging.
Frequently asked questions
- Is my text uploaded anywhere?
- No. Analysis runs in your browser. Pasted text never leaves your device.
- Can it find hidden or invisible characters?
- Yes — zero-width characters, the BOM, bidi controls, variation selectors, tag characters and unusual spaces, with a one-click cleaned copy.
- Does it detect AI watermark characters?
- It flags the invisible codepoints commonly used as watermarks; listing and stripping them reveals and removes them, though no tool catches every scheme.
- What does the per-character table show?
- Glyph, codepoint (U+ and decimal), name/category, UTF-8 bytes, UTF-16 units and JavaScript/HTML escapes, with flagged rows for invisible or confusable characters.