Which tokenizer does this use?

It uses OpenAI's tiktoken byte-pair encodings, bundled and run in your browser: o200k_base for GPT-4o, GPT-4.1 and the o-series, and cl100k_base for GPT-4, GPT-4 Turbo, GPT-3.5 Turbo and the text-embedding-3 models. The counts are exact for those models.

Is my prompt sent anywhere?

No. The encoding tables are downloaded once with the page and tokenization runs as JavaScript in your browser. Your text is never uploaded, so it is safe for proprietary prompts and private data.

Are the prices accurate?

The price per million tokens is a reference default that you can edit. Provider pricing changes over time, so always confirm the current rate with your API provider before relying on a cost estimate.

Why does the count differ from another tool?

Token counts depend on the model's encoding. A GPT-4o (o200k_base) count differs from a GPT-3.5 (cl100k_base) count for the same text. Chat messages also add a few overhead tokens per message that a raw text count does not include.

LLM Token Counter & Cost Estimator

On this page

How to use it
What a token is
Models & encodings
Estimating cost
Why count locally
FAQ

How to count tokens

Paste or type your text into the box and the token count updates instantly — there is no button to press and nothing to submit. Pick the model you are targeting from the dropdown: the tool automatically switches to the correct encoding, sets the right context-window size, and pre-fills a reference price you can edit. The result is the same number the model's own tokenizer would produce, because it uses the identical byte-pair encoding tables, run as JavaScript inside your browser. It is built for the everyday questions prompt engineers actually ask: will this prompt fit in the window, how close am I to the limit, and roughly what will a thousand calls cost.
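The "will it fit" question comes down to simple arithmetic once you have a token count. The sketch below is only an illustration of that check, not the tool's own code: the `window_usage` helper and the 128,000-token window are hypothetical, so substitute the limit documented for your model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowUsage:
    used: int        # tokens in the prompt
    limit: int       # context-window size in tokens
    remaining: int   # tokens left over (negative when the prompt is too long)
    fraction: float  # share of the window the prompt fills

    @property
    def fits(self) -> bool:
        return self.used <= self.limit


def window_usage(token_count: int, context_window: int) -> WindowUsage:
    """Compare a token count with a context-window size (hypothetical helper)."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return WindowUsage(
        used=token_count,
        limit=context_window,
        remaining=context_window - token_count,
        fraction=token_count / context_window,
    )


# Assumed figures: a 96,000-token prompt against a 128,000-token window.
usage = window_usage(96_000, 128_000)
print(f"{usage.fraction:.0%} used, {usage.remaining} tokens left, fits={usage.fits}")
```

Remember to leave headroom for the model's reply, since output tokens count against the same window.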

What a token actually is

Large language models do not read characters or words. They read tokens, the sub-word units produced by a byte-pair-encoding (BPE) tokenizer. A common English word is usually a single token, a rare word is split into several, a leading space is usually merged into the token for the word that follows it, and punctuation often becomes a token of its own. As a rough guide, one token is about four characters of English text, or roughly three-quarters of a word, but the only way to know the real figure is to run the actual tokenizer, which is exactly what this tool does. Counting tokens matters because every model has a fixed context window measured in tokens, and API usage is billed per token of input and output.
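To see why a common word ends up as one token while a rare word is split, here is a toy sketch of the merge step at the heart of BPE. It is a simplification under stated assumptions: the four merge rules are invented for the example, and it works on characters, whereas the real o200k_base and cl100k_base tables contain many thousands of merges and operate on bytes.

```python
from __future__ import annotations


def bpe_encode(word: str, merges: list[tuple[str, str]]) -> list[str]:
    """Repeatedly apply the highest-priority merge until none applies."""
    # Earlier entries in `merges` have higher priority (lower rank).
    rank = {pair: i for i, pair in enumerate(merges)}
    parts = list(word)
    while len(parts) > 1:
        pairs = [(parts[i], parts[i + 1]) for i in range(len(parts) - 1)]
        best = min(pairs, key=lambda p: rank.get(p, float("inf")))
        if best not in rank:
            break  # no known merge is left
        merged = []
        i = 0
        while i < len(parts):
            if i < len(parts) - 1 and (parts[i], parts[i + 1]) == best:
                merged.append(parts[i] + parts[i + 1])
                i += 2
            else:
                merged.append(parts[i])
                i += 1
        parts = merged
    return parts


# Invented merge table, for illustration only.
TOY_MERGES = [("t", "h"), ("th", "e"), ("i", "n"), ("in", "g")]

print(bpe_encode("the", TOY_MERGES))    # ['the']
print(bpe_encode("thing", TOY_MERGES))  # ['th', 'ing']
print(bpe_encode("zebra", TOY_MERGES))  # ['z', 'e', 'b', 'r', 'a']
```

With this toy table, "the" collapses to a single token because every merge it needs is present, while "zebra" matches no merge and stays as five single-character tokens.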

Models and their encodings

OpenAI models share a small number of encodings. This tool bundles the two that cover the current line-up:

o200k_base — used by GPT-4o, GPT-4o mini, GPT-4.1 and the reasoning o-series (o1, o3, o4-mini). It is the newer, more efficient encoding, so the same text usually costs fewer tokens here than under the older one.
cl100k_base — used by GPT-4, GPT-4 Turbo, GPT-3.5 Turbo and the text-embedding-3 models.
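The grouping above can be expressed as a plain lookup table. This is a sketch of how a model dropdown might choose an encoding, not the tool's actual source: the `encoding_for` helper is hypothetical, and the model identifier strings are written the way they commonly appear in API calls, which is an assumption here.

```python
# Model-to-encoding table taken from the two groups listed above.
ENCODING_FOR_MODEL = {
    "gpt-4o": "o200k_base",
    "gpt-4o-mini": "o200k_base",
    "gpt-4.1": "o200k_base",
    "o1": "o200k_base",
    "o3": "o200k_base",
    "o4-mini": "o200k_base",
    "gpt-4": "cl100k_base",
    "gpt-4-turbo": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-embedding-3-small": "cl100k_base",
    "text-embedding-3-large": "cl100k_base",
}


def encoding_for(model: str) -> str:
    """Return the encoding name for a model, or fail loudly if it is unknown."""
    try:
        return ENCODING_FOR_MODEL[model]
    except KeyError:
        raise ValueError(f"no bundled encoding for model {model!r}") from None


print(encoding_for("gpt-4o"))         # o200k_base
print(encoding_for("gpt-3.5-turbo"))  # cl100k_base
```

Raising on an unknown model, rather than silently falling back to a default, avoids reporting a count from the wrong encoding.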

Because the encoding differs, the same text produces a different token count on GPT-4o than on GPT-3.5 — switching the model dropdown re-counts immediately so you can compare. Counting raw text gives you the prompt body; a real chat request adds a few overhead tokens per message for the role and delimiters, so treat the figure as the content size rather than the exact wire cost of a multi-message conversation.
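If you want a rough figure for a whole conversation, you can add a per-message allowance on top of the content counts. The sketch below assumes a flat overhead per message; the default of 3 tokens is a placeholder, not a documented constant, and the real overhead depends on the model and message format.

```python
from __future__ import annotations


def estimate_chat_tokens(
    content_token_counts: list[int],
    overhead_per_message: int = 3,  # placeholder assumption, adjust per model
) -> int:
    """Sum per-message content counts and add a flat overhead per message."""
    if overhead_per_message < 0:
        raise ValueError("overhead_per_message must be non-negative")
    content = sum(content_token_counts)
    overhead = overhead_per_message * len(content_token_counts)
    return content + overhead


# Assumed content counts for a system, user and assistant message.
counts = [12, 250, 40]
print(sum(counts))                   # 302 content tokens
print(estimate_chat_tokens(counts))  # 311 with the assumed overhead
```

The gap between the two numbers grows with the number of messages, not with their length, so it matters most for long conversations made of short turns.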

Estimating API cost

The cost estimate divides your token count by one million and multiplies the result by the editable price per million input tokens. Each model pre-fills a reference rate, but provider pricing changes, and output tokens are billed separately and usually cost more, so the field is yours to override. The "per 1,000 calls" figure helps you reason about a feature at scale: a prompt that looks cheap per call can dominate a budget once it runs in a loop. Treat every number here as a planning estimate, not a billing authority, and always confirm the live rate with your provider.
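The arithmetic behind the estimate is short enough to write out. This is a minimal sketch covering input tokens only; the `estimate_cost` helper is hypothetical and the $2.50 per million rate is a made-up figure, not a quoted price.

```python
def estimate_cost(tokens: int, price_per_million: float, calls: int = 1) -> float:
    """Input-token cost: tokens / 1,000,000 * price per million * number of calls."""
    if tokens < 0 or price_per_million < 0 or calls < 0:
        raise ValueError("arguments must be non-negative")
    return tokens / 1_000_000 * price_per_million * calls


# Assumed inputs: a 1,200-token prompt at a hypothetical $2.50 per million tokens.
per_call = estimate_cost(1_200, 2.50)
per_thousand = estimate_cost(1_200, 2.50, calls=1_000)
print(f"per call: ${per_call:.4f}")
print(f"per 1,000 calls: ${per_thousand:.2f}")
```

Under these assumed numbers a single call costs a fraction of a cent, yet a thousand calls come to about three dollars, which is the kind of scaling the "per 1,000 calls" figure is meant to expose.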

Why count tokens locally

Tokenization is pure, deterministic computation: the same input and encoding always yield the same count. That makes a real tokenizer far more trustworthy than asking an AI assistant to "guess how many tokens this is", a question it routinely gets wrong. More importantly, prompts are often proprietary: system prompts, customer data, unreleased copy. Running the encoding in your browser means none of it is uploaded or logged, matching the gitime.dev default that your data stays on your device.

Exact — real tiktoken o200k_base and cl100k_base encodings.
Live — counts update as you type.
Context-aware — see how much of the window you fill.
Cost — editable price, per-call and per-1,000 estimates.
Local — prompts never leave the browser.

Frequently asked questions

Which tokenizer does this use? Real tiktoken encodings bundled in-browser: o200k_base for GPT-4o / GPT-4.1 / o-series, cl100k_base for GPT-4 / GPT-3.5 / embeddings.
Is my prompt sent anywhere? No. The encoding tables load with the page and tokenization runs locally, so prompts are never uploaded.
Are the prices accurate? The rate is an editable reference default; provider pricing changes, so confirm the current price before relying on a cost.
Why does the count differ from another tool? Counts depend on the model's encoding, and chat requests add a few overhead tokens per message that a raw text count omits.