Representing Characters in HTML

Numeric Character References (NCRs), Character Entity References

There are several ways to represent a character in HTML, some right, some wrong. Here is some guidance and tables showing how characters can be represented.


Direct Character Entry

Of course, you can enter the character directly, stored as a binary value in the character encoding of the file. For example, if you type the letter "a", the value 97 decimal (61 hexadecimal) is entered into the editor's buffer (assuming an ASCII-based character encoding).
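As a small illustration, the following Python sketch shows the code point of a typed character and the bytes that would be stored for it under a few encodings. The helper name stored_bytes is hypothetical and exists only for this example.

```python
def stored_bytes(ch, encoding):
    """Return the byte values stored for ch when the file uses the given encoding."""
    return list(ch.encode(encoding))


# The letter "a": code point in decimal and in hexadecimal.
print(ord("a"), format(ord("a"), "X"))

# In an ASCII-based encoding, "a" is stored as the single byte 97.
print(stored_bytes("a", "ascii"))

# A non-ASCII character is stored differently depending on the encoding.
yen = "\u00a5"
print(stored_bytes(yen, "iso-8859-1"))  # one byte
print(stored_bytes(yen, "utf-8"))       # two bytes
```

The point of the last two lines is that direct entry is only safe when the encoding declared for the page matches the encoding the editor actually used to save the file.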

Numeric Character Reference (NCR)

You can also represent a character using a Numeric Character Reference of the form &#dddd;, where dddd is the decimal form of the character's Unicode scalar value. Alternatively, you can use the hexadecimal form &#xhhhh;, where hhhh is the same value written in hexadecimal. For example, the yen sign (¥) has the decimal value 165 in Unicode, or A5 in hexadecimal, and can therefore be represented as &#165; or &#xA5;.
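A minimal Python sketch of both forms follows. The function names decimal_ncr and hex_ncr are hypothetical; html.unescape from the standard library is used only to check that both references decode back to the same character.

```python
import html


def decimal_ncr(ch):
    """Build the decimal Numeric Character Reference for one character."""
    return "&#{};".format(ord(ch))


def hex_ncr(ch):
    """Build the hexadecimal Numeric Character Reference for one character."""
    return "&#x{:X};".format(ord(ch))


yen = "\u00a5"
print(decimal_ncr(yen))  # &#165;
print(hex_ncr(yen))      # &#xA5;

# Both forms decode to the same character.
print(html.unescape("&#165;") == yen, html.unescape("&#xA5;") == yen)
```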

For the values 160-255 (decimal), the Unicode scalar values are identical to the code points for the same characters in ISO 8859-1 and Windows-1252. However, it is important to remember that Numeric Character References are defined in terms of Unicode and not any other character encoding.
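The claim about the 160-255 range can be checked mechanically. The sketch below assumes Python's built-in "iso-8859-1" and "cp1252" codecs as stand-ins for the two encodings; the function name same_in_all_three is hypothetical.

```python
def same_in_all_three(n):
    """True if byte n decodes to Unicode code point n in both encodings."""
    b = bytes([n])
    return b.decode("iso-8859-1") == b.decode("cp1252") == chr(n)


# Collect any byte value in 160-255 where the three disagree.
mismatches = [n for n in range(160, 256) if not same_in_all_three(n)]
print(mismatches)  # an empty list means the range is identical
```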

In the code point range 128-159 (decimal), Windows-1252 assigns characters, whereas ISO 8859-1 assigns control codes. Unicode assigns the same control codes to that range. The characters in that range in Windows-1252 have very different code points in Unicode.

For example, the euro currency symbol (€) has the code point 128 decimal (80 hexadecimal) in the Windows-1252 code page. In Unicode, the euro is represented by the scalar value U+20AC (8364 decimal), so its Numeric Character Reference is &#8364; or &#x20AC;, not &#128;.

See Table 128-159 below for the Windows-1252 characters that are affected and that require Numeric Character Reference values different from their Windows-1252 code points.
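The mapping from a Windows-1252 byte to the correct Numeric Character Reference can be sketched in Python as follows. This assumes Python's built-in "cp1252" codec, which rejects the unassigned byte values; the function name cp1252_to_ncr is hypothetical.

```python
def cp1252_to_ncr(byte_value):
    """Return the decimal NCR for a Windows-1252 byte, or None if unassigned."""
    try:
        ch = bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        # Python's cp1252 codec defines no character for this byte value.
        return None
    return "&#{};".format(ord(ch))


# Print the Unicode-based NCR for every byte in the 128-159 range.
for b in range(128, 160):
    print(b, cp1252_to_ncr(b))

unassigned = [b for b in range(128, 160) if cp1252_to_ncr(b) is None]
print(unassigned)
```

Note that the NCR is always derived from the Unicode code point of the decoded character, never from the byte value itself.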

Character Entity References

HTML also defines Character Entity References, i.e. short text names that can be used to identify a character. For example, to identify the yen sign, you can use &yen;. The names are listed in the tables below. To use a name as a Character Entity Reference, prepend an ampersand "&" and append a semicolon ";". The euro can be represented by &euro;.
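Python's standard library ships the HTML 4 entity names in html.entities, which makes the relationship between a name and its code point easy to inspect. In the sketch below, the function name entity_reference is hypothetical; name2codepoint, codepoint2name and html.unescape are standard-library names.

```python
import html
from html.entities import codepoint2name, name2codepoint


def entity_reference(name):
    """Build a Character Entity Reference from an HTML 4 entity name."""
    if name not in name2codepoint:
        raise KeyError("not an HTML 4 entity name: " + name)
    return "&" + name + ";"


# Entity name to Unicode code point (decimal).
print(name2codepoint["yen"], name2codepoint["euro"])

# Code point back to entity name.
print(codepoint2name[0x20AC])

# Build the references and decode them again.
refs = entity_reference("yen") + " " + entity_reference("euro")
print(refs, "->", html.unescape(refs))
```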

Index of tables
Table Special
This table lists characters that are significant to the syntax of HTML or are useful to control Unicode text.
The columns are: the name used in Character Entity Reference, the decimal Numeric Character Reference, the character glyph, and the character description.
Table 128-159
This table lists, for each character in the Windows-1252 range 128-159, the Windows-1252 code point, the decimal Numeric Character Reference, the character, the Character Entity Reference (or the hexadecimal NCR where no entity name is defined), and the character name.
Table 160-255 (A0-FF hexadecimal)
This table lists the name used in the Character Entity Reference, the decimal Numeric Character Reference, the hexadecimal Numeric Character Reference, the character, and the character name, for each character in the range 160-255 decimal (A0-FF hexadecimal).
Table Other
This table lists the remaining characters that have Character Entity References defined in HTML 4.
The columns are: the name used in Character Entity Reference, the decimal Numeric Character Reference, the character glyph, and the character description.
Table Special
These characters are significant to HTML syntax or useful for controlling Unicode text.
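Entity name  Decimal NCR  Character    Character name
quot         &#34;        "            quotation mark = APL quote
amp          &#38;        &            ampersand
lt           &#60;        <            less-than sign
gt           &#62;        >            greater-than sign
ensp         &#8194;      (space)      en space
emsp         &#8195;      (space)      em space
thinsp       &#8201;      (space)      thin space
zwnj         &#8204;      (invisible)  zero width non-joiner
zwj          &#8205;      (invisible)  zero width joiner
lrm          &#8206;      (invisible)  left-to-right mark
rlm          &#8207;      (invisible)  right-to-left mark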
Table 128-159
The Unicode-based NCRs and Character Entity References are given for the characters in the range 128-159 in Windows-1252.
cp1252 (decimal)  Decimal NCR  Character  Entity or hex NCR  Character name
128               &#8364;      €          &euro;             EURO SIGN
129                                                          unassigned
130               &#8218;      ‚          &sbquo;            SINGLE LOW-9 QUOTATION MARK
131               &#402;       ƒ          &fnof;             LATIN SMALL LETTER F WITH HOOK
132               &#8222;      „          &bdquo;            DOUBLE LOW-9 QUOTATION MARK
133               &#8230;      …          &hellip;           HORIZONTAL ELLIPSIS
134               &#8224;      †          &dagger;           DAGGER
135               &#8225;      ‡          &Dagger;           DOUBLE DAGGER
136               &#710;       ˆ          &circ;             MODIFIER LETTER CIRCUMFLEX ACCENT
137               &#8240;      ‰          &permil;           PER MILLE SIGN
138               &#352;       Š          &Scaron;           LATIN CAPITAL LETTER S WITH CARON
139               &#8249;      ‹          &lsaquo;           SINGLE LEFT-POINTING ANGLE QUOTATION MARK
140               &#338;       Œ          &OElig;            LATIN CAPITAL LIGATURE OE
141                                                          unassigned
142               &#381;       Ž          &#x17D;            LATIN CAPITAL LETTER Z WITH CARON
143                                                          unassigned
144                                                          unassigned
145               &#8216;      ‘          &lsquo;            LEFT SINGLE QUOTATION MARK
146               &#8217;      ’          &rsquo;            RIGHT SINGLE QUOTATION MARK
147               &#8220;      “          &ldquo;            LEFT DOUBLE QUOTATION MARK
148               &#8221;      ”          &rdquo;            RIGHT DOUBLE QUOTATION MARK
149               &#8226;      •          &bull;             BULLET
150               &#8211;      –          &ndash;            EN DASH
151               &#8212;      —          &mdash;            EM DASH
152               &#732;       ˜          &tilde;            SMALL TILDE
153               &#8482;      ™          &trade;            TRADE MARK SIGN
154               &#353;       š          &scaron;           LATIN SMALL LETTER S WITH CARON
155               &#8250;      ›          &rsaquo;           SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
156               &#339;       œ          &oelig;            LATIN SMALL LIGATURE OE
157                                                          unassigned
158               &#382;       ž          &#x17E;            LATIN SMALL LETTER Z WITH CARON
159               &#376;       Ÿ          &Yuml;             LATIN CAPITAL LETTER Y WITH DIAERESIS
Table 160-255
These are the characters in the range 160-255 decimal (A0-FF hexadecimal).
Entity name  Decimal NCR  Hex NCR  Character    Character name
nbsp         &#160;       &#xA0;   (space)      no-break space
iexcl        &#161;       &#xA1;   ¡            inverted exclamation mark
cent         &#162;       &#xA2;   ¢            cent sign
pound        &#163;       &#xA3;   £            pound sterling sign
curren       &#164;       &#xA4;   ¤            general currency sign
yen          &#165;       &#xA5;   ¥            yen sign
brvbar       &#166;       &#xA6;   ¦            broken (vertical) bar
sect         &#167;       &#xA7;   §            section sign
uml          &#168;       &#xA8;   ¨            umlaut (dieresis)
copy         &#169;       &#xA9;   ©            copyright sign
ordf         &#170;       &#xAA;   ª            ordinal indicator, feminine
laquo        &#171;       &#xAB;   «            angle quotation mark, left
not          &#172;       &#xAC;   ¬            not sign
shy          &#173;       &#xAD;   (invisible)  soft hyphen
reg          &#174;       &#xAE;   ®            registered sign
macr         &#175;       &#xAF;   ¯            macron
deg          &#176;       &#xB0;   °            degree sign
plusmn       &#177;       &#xB1;   ±            plus-or-minus sign
sup2         &#178;       &#xB2;   ²            superscript two
sup3         &#179;       &#xB3;   ³            superscript three
acute        &#180;       &#xB4;   ´            acute accent
micro        &#181;       &#xB5;   µ            micro sign
para         &#182;       &#xB6;   ¶            pilcrow (paragraph sign)
middot       &#183;       &#xB7;   ·            middle dot
cedil        &#184;       &#xB8;   ¸            cedilla
sup1         &#185;       &#xB9;   ¹            superscript one
ordm         &#186;       &#xBA;   º            ordinal indicator, masculine
raquo        &#187;       &#xBB;   »            angle quotation mark, right
frac14       &#188;       &#xBC;   ¼            fraction one-quarter
frac12       &#189;       &#xBD;   ½            fraction one-half
frac34       &#190;       &#xBE;   ¾            fraction three-quarters
iquest       &#191;       &#xBF;   ¿            inverted question mark
Agrave       &#192;       &#xC0;   À            capital A, grave accent
Aacute       &#193;       &#xC1;   Á            capital A, acute accent
Acirc        &#194;       &#xC2;   Â            capital A, circumflex accent
Atilde       &#195;       &#xC3;   Ã            capital A, tilde
Auml         &#196;       &#xC4;   Ä            capital A, dieresis or umlaut mark
Aring        &#197;       &#xC5;   Å            capital A, ring
AElig        &#198;       &#xC6;   Æ            capital AE diphthong (ligature)
Ccedil       &#199;       &#xC7;   Ç            capital C, cedilla
Egrave       &#200;       &#xC8;   È            capital E, grave accent
Eacute       &#201;       &#xC9;   É            capital E, acute accent
Ecirc        &#202;       &#xCA;   Ê            capital E, circumflex accent
Euml         &#203;       &#xCB;   Ë            capital E, dieresis or umlaut mark
Igrave       &#204;       &#xCC;   Ì            capital I, grave accent
Iacute       &#205;       &#xCD;   Í            capital I, acute accent
Icirc        &#206;       &#xCE;   Î            capital I, circumflex accent
Iuml         &#207;       &#xCF;   Ï            capital I, dieresis or umlaut mark
ETH          &#208;       &#xD0;   Ð            capital Eth, Icelandic
Ntilde       &#209;       &#xD1;   Ñ            capital N, tilde
Ograve       &#210;       &#xD2;   Ò            capital O, grave accent
Oacute       &#211;       &#xD3;   Ó            capital O, acute accent
Ocirc        &#212;       &#xD4;   Ô            capital O, circumflex accent
Otilde       &#213;       &#xD5;   Õ            capital O, tilde
Ouml         &#214;       &#xD6;   Ö            capital O, dieresis or umlaut mark
times        &#215;       &#xD7;   ×            multiply sign
Oslash       &#216;       &#xD8;   Ø            capital O, slash
Ugrave       &#217;       &#xD9;   Ù            capital U, grave accent
Uacute       &#218;       &#xDA;   Ú            capital U, acute accent
Ucirc        &#219;       &#xDB;   Û            capital U, circumflex accent
Uuml         &#220;       &#xDC;   Ü            capital U, dieresis or umlaut mark
Yacute       &#221;       &#xDD;   Ý            capital Y, acute accent
THORN        &#222;       &#xDE;   Þ            capital THORN, Icelandic
szlig        &#223;       &#xDF;   ß            small sharp s, German (sz ligature)
agrave       &#224;       &#xE0;   à            small a, grave accent
aacute       &#225;       &#xE1;   á            small a, acute accent
acirc        &#226;       &#xE2;   â            small a, circumflex accent
atilde       &#227;       &#xE3;   ã            small a, tilde
auml         &#228;       &#xE4;   ä            small a, dieresis or umlaut mark
aring        &#229;       &#xE5;   å            small a, ring
aelig        &#230;       &#xE6;   æ            small ae diphthong (ligature)
ccedil       &#231;       &#xE7;   ç            small c, cedilla
egrave       &#232;       &#xE8;   è            small e, grave accent
eacute       &#233;       &#xE9;   é            small e, acute accent
ecirc        &#234;       &#xEA;   ê            small e, circumflex accent
euml         &#235;       &#xEB;   ë            small e, dieresis or umlaut mark
igrave       &#236;       &#xEC;   ì            small i, grave accent
iacute       &#237;       &#xED;   í            small i, acute accent
icirc        &#238;       &#xEE;   î            small i, circumflex accent
iuml         &#239;       &#xEF;   ï            small i, dieresis or umlaut mark
eth          &#240;       &#xF0;   ð            small eth, Icelandic
ntilde       &#241;       &#xF1;   ñ            small n, tilde
ograve       &#242;       &#xF2;   ò            small o, grave accent
oacute       &#243;       &#xF3;   ó            small o, acute accent
ocirc        &#244;       &#xF4;   ô            small o, circumflex accent
otilde       &#245;       &#xF5;   õ            small o, tilde
ouml         &#246;       &#xF6;   ö            small o, dieresis or umlaut mark
divide       &#247;       &#xF7;   ÷            divide sign
oslash       &#248;       &#xF8;   ø            small o, slash
ugrave       &#249;       &#xF9;   ù            small u, grave accent
uacute       &#250;       &#xFA;   ú            small u, acute accent
ucirc        &#251;       &#xFB;   û            small u, circumflex accent
uuml         &#252;       &#xFC;   ü            small u, dieresis or umlaut mark
yacute       &#253;       &#xFD;   ý            small y, acute accent
thorn        &#254;       &#xFE;   þ            small thorn, Icelandic
yuml         &#255;       &#xFF;   ÿ            small y, dieresis or umlaut mark
Table Other
These are the remaining characters with Character Entity References defined in HTML 4.
Entity name  Decimal NCR  Character  Character name
fnof         &#402;       ƒ          latin small f with hook, U+0192 (also used to represent the former Dutch currency Florin, Guilder or Gulden)
Alpha        &#913;       Α          greek capital letter alpha, U+0391
Beta         &#914;       Β          greek capital letter beta, U+0392
Gamma        &#915;       Γ          greek capital letter gamma, U+0393
Delta        &#916;       Δ          greek capital letter delta, U+0394
Epsilon      &#917;       Ε          greek capital letter epsilon, U+0395
Zeta         &#918;       Ζ          greek capital letter zeta, U+0396
Eta          &#919;       Η          greek capital letter eta, U+0397
Theta        &#920;       Θ          greek capital letter theta, U+0398
Iota         &#921;       Ι          greek capital letter iota, U+0399
Kappa        &#922;       Κ          greek capital letter kappa, U+039A
Lambda       &#923;       Λ          greek capital letter lambda, U+039B
Mu           &#924;       Μ          greek capital letter mu, U+039C
Nu           &#925;       Ν          greek capital letter nu, U+039D
Xi           &#926;       Ξ          greek capital letter xi, U+039E
Omicron      &#927;       Ο          greek capital letter omicron, U+039F
Pi           &#928;       Π          greek capital letter pi, U+03A0
Rho          &#929;       Ρ          greek capital letter rho, U+03A1
Sigma        &#931;       Σ          greek capital letter sigma, U+03A3
Tau          &#932;       Τ          greek capital letter tau, U+03A4
Upsilon      &#933;       Υ          greek capital letter upsilon, U+03A5
Phi          &#934;       Φ          greek capital letter phi, U+03A6
Chi          &#935;       Χ          greek capital letter chi, U+03A7
Psi          &#936;       Ψ          greek capital letter psi, U+03A8
Omega        &#937;       Ω          greek capital letter omega, U+03A9
alpha        &#945;       α          greek small letter alpha, U+03B1
beta         &#946;       β          greek small letter beta, U+03B2
gamma        &#947;       γ          greek small letter gamma, U+03B3
delta        &#948;       δ          greek small letter delta, U+03B4
epsilon      &#949;       ε          greek small letter epsilon, U+03B5
zeta         &#950;       ζ          greek small letter zeta, U+03B6
eta          &#951;       η          greek small letter eta, U+03B7
theta        &#952;       θ          greek small letter theta, U+03B8
iota         &#953;       ι          greek small letter iota, U+03B9
kappa        &#954;       κ          greek small letter kappa, U+03BA
lambda       &#955;       λ          greek small letter lambda, U+03BB
mu           &#956;       μ          greek small letter mu, U+03BC
nu           &#957;       ν          greek small letter nu, U+03BD
xi           &#958;       ξ          greek small letter xi, U+03BE
omicron      &#959;       ο          greek small letter omicron, U+03BF
pi           &#960;       π          greek small letter pi, U+03C0
rho          &#961;       ρ          greek small letter rho, U+03C1
sigmaf       &#962;       ς          greek small letter final sigma, U+03C2
sigma        &#963;       σ          greek small letter sigma, U+03C3
tau          &#964;       τ          greek small letter tau, U+03C4
upsilon      &#965;       υ          greek small letter upsilon, U+03C5
phi          &#966;       φ          greek small letter phi, U+03C6
chi          &#967;       χ          greek small letter chi, U+03C7
psi          &#968;       ψ          greek small letter psi, U+03C8
omega        &#969;       ω          greek small letter omega, U+03C9
thetasym     &#977;       ϑ          greek small letter theta symbol, U+03D1
upsih        &#978;       ϒ          greek upsilon with hook symbol, U+03D2
piv          &#982;       ϖ          greek pi symbol, U+03D6
bull         &#8226;      •          bullet = black small circle, U+2022
hellip       &#8230;      …          horizontal ellipsis = three dot leader, U+2026
prime        &#8242;      ′          prime = minutes = feet, U+2032
Prime        &#8243;      ″          double prime = seconds = inches, U+2033
oline        &#8254;      ‾          overline = spacing overscore, U+203E
frasl        &#8260;      ⁄          fraction slash, U+2044
weierp       &#8472;      ℘          script capital P = power set = Weierstrass p, U+2118
image        &#8465;      ℑ          blackletter capital I = imaginary part, U+2111
real         &#8476;      ℜ          blackletter capital R = real part symbol, U+211C
trade        &#8482;      ™          trade mark sign, U+2122
alefsym      &#8501;      ℵ          alef symbol = first transfinite cardinal, U+2135
larr         &#8592;      ←          leftwards arrow, U+2190
uarr         &#8593;      ↑          upwards arrow, U+2191
rarr         &#8594;      →          rightwards arrow, U+2192
darr         &#8595;      ↓          downwards arrow, U+2193
harr         &#8596;      ↔          left right arrow, U+2194
crarr        &#8629;      ↵          downwards arrow with corner leftwards = carriage return, U+21B5
lArr         &#8656;      ⇐          leftwards double arrow, U+21D0
uArr         &#8657;      ⇑          upwards double arrow, U+21D1
rArr         &#8658;      ⇒          rightwards double arrow, U+21D2
dArr         &#8659;      ⇓          downwards double arrow, U+21D3
hArr         &#8660;      ⇔          left right double arrow, U+21D4
forall       &#8704;      ∀          for all, U+2200
part         &#8706;      ∂          partial differential, U+2202
exist        &#8707;      ∃          there exists, U+2203
empty        &#8709;      ∅          empty set = null set = diameter, U+2205
nabla        &#8711;      ∇          nabla = backward difference, U+2207
isin         &#8712;      ∈          element of, U+2208
notin        &#8713;      ∉          not an element of, U+2209
ni           &#8715;      ∋          contains as member, U+220B
prod         &#8719;      ∏          n-ary product = product sign, U+220F
sum          &#8721;      ∑          n-ary summation, U+2211
minus        &#8722;      −          minus sign, U+2212
lowast       &#8727;      ∗          asterisk operator, U+2217
radic        &#8730;      √          square root = radical sign, U+221A
prop         &#8733;      ∝          proportional to, U+221D
infin        &#8734;      ∞          infinity, U+221E
ang          &#8736;      ∠          angle, U+2220
and          &#8743;      ∧          logical and = wedge, U+2227
or           &#8744;      ∨          logical or = vee, U+2228
cap          &#8745;      ∩          intersection = cap, U+2229
cup          &#8746;      ∪          union = cup, U+222A
int          &#8747;      ∫          integral, U+222B
there4       &#8756;      ∴          therefore, U+2234
sim          &#8764;      ∼          tilde operator = varies with = similar to, U+223C
cong         &#8773;      ≅          approximately equal to, U+2245
asymp        &#8776;      ≈          almost equal to = asymptotic to, U+2248
ne           &#8800;      ≠          not equal to, U+2260
equiv        &#8801;      ≡          identical to, U+2261
le           &#8804;      ≤          less-than or equal to, U+2264
ge           &#8805;      ≥          greater-than or equal to, U+2265
sub          &#8834;      ⊂          subset of, U+2282
sup          &#8835;      ⊃          superset of, U+2283
nsub         &#8836;      ⊄          not a subset of, U+2284
sube         &#8838;      ⊆          subset of or equal to, U+2286
supe         &#8839;      ⊇          superset of or equal to, U+2287
oplus        &#8853;      ⊕          circled plus = direct sum, U+2295
otimes       &#8855;      ⊗          circled times = vector product, U+2297
perp         &#8869;      ⊥          up tack = orthogonal to = perpendicular, U+22A5
sdot         &#8901;      ⋅          dot operator, U+22C5
lceil        &#8968;      ⌈          left ceiling = apl upstile, U+2308
rceil        &#8969;      ⌉          right ceiling, U+2309
lfloor       &#8970;      ⌊          left floor = apl downstile, U+230A
rfloor       &#8971;      ⌋          right floor, U+230B
lang         &#9001;      〈          left-pointing angle bracket = bra, U+2329
rang         &#9002;      〉          right-pointing angle bracket = ket, U+232A
loz          &#9674;      ◊          lozenge, U+25CA
spades       &#9824;      ♠          black spade suit, U+2660
clubs        &#9827;      ♣          black club suit = shamrock, U+2663
hearts       &#9829;      ♥          black heart suit = valentine, U+2665
diams        &#9830;      ♦          black diamond suit, U+2666