Comparison of ARM processors

From Wikipedia, the free encyclopedia

This is a comparison of ARM instruction set architecture application processor cores designed by Arm Holdings (ARM Cortex-A) and third parties. It does not include ARM Cortex-R, ARM Cortex-M, or legacy ARM cores.

ARMv7-A


This is a table comparing 32-bit central processing units that implement the ARMv7-A (A means Application[1]) instruction set architecture and its mandatory or optional extensions. ARMv7-A is the last architecture version that is 32-bit only (AArch32).

Core | Decode width | Execution ports | Pipeline depth | Out-of-order execution | FPU | Pipelined VFP | FPU registers | NEON (SIMD) | big.LITTLE role | Virtualization[2] | Process technology | L0 cache | L1 cache | L2 cache | Core configurations | Speed per core (DMIPS/MHz) | ARM part number (in the main ID register)
ARM Cortex-A5 | 1 |  | 8 | No | VFPv4 (optional) |  | 16 × 64-bit | 64-bit wide (optional) | No | No | 40/28 nm |  | 4–64 KiB / core |  | 1, 2, 4 | 1.57 | 0xC05
ARM Cortex-A7 | 2 | 5[3] | 8 | No | VFPv4 | Yes | 16 × 64-bit | 64-bit wide | LITTLE | Yes[4] | 40/28 nm |  | 8–64 KiB / core | up to 1 MiB (optional) | 1, 2, 4, 8 | 1.9 | 0xC07
ARM Cortex-A8 | 2 | 2[5] | 13 | No | VFPv3 | No | 32 × 64-bit | 64-bit wide | No | No | 65/55/45 nm |  | 32 KiB + 32 KiB | 256 or 512 (typical) KiB | 1 | 2.0 | 0xC08
ARM Cortex-A9 | 2 | 3[6] | 8–11[7] | Yes | VFPv3 (optional) | Yes | (16 or 32) × 64-bit | 64-bit wide (optional) | Companion Core | No[7] | 65/45/40/32/28 nm |  | 32 KiB + 32 KiB | 1 MiB | 1, 2, 4 | 2.5 | 0xC09
ARM Cortex-A12 | 2 |  | 11 | Yes | VFPv4 | Yes | 32 × 64-bit | 128-bit wide | No[8] | Yes | 28 nm |  | 32–64 KiB + 32 KiB | 256 KiB, to 8 MiB | 1, 2, 4 | 3.0 | 0xC0D
ARM Cortex-A15 | 3 | 8[3] | 15/17-25 | Yes | VFPv4 | Yes | 32 × 64-bit | 128-bit wide | big | Yes[9] | 32/28/20 nm |  | 32 KiB + 32 KiB per core | up to 4 MiB per cluster, up to 8 MiB per chip | 2, 4, 8 (4×2) | 3.5 to 4.01 | 0xC0F
ARM Cortex-A17 | 2[10] |  | 11+ | Yes | VFPv4 | Yes | 32 × 64-bit | 128-bit wide | big | Yes | 28 nm |  | 32 KiB + 32 KiB per core | 256 KiB, up to 8 MiB | up to 4 | 4.0 | 0xC0E
Qualcomm Scorpion | 2 | 3[11] | 10 | Yes (FXU&LSU only)[12] | VFPv3 | Yes |  | 128-bit wide | No |  | 65/45 nm |  | 32 KiB + 32 KiB | 256 KiB (single-core), 512 KiB (dual-core) | 1, 2 | 2.1 | 0x00F
Qualcomm Krait[13] | 3 | 7 | 11 | Yes | VFPv4[14] | Yes |  | 128-bit wide | No |  | 28 nm | 4 KiB + 4 KiB direct mapped | 16 KiB + 16 KiB 4-way set associative | 1 MiB 8-way set associative (dual-core) / 2 MiB (quad-core) | 2, 4 | 3.3 (Krait 200), 3.39 (Krait 300), 3.39 (Krait 400), 3.51 (Krait 450) | 0x04D, 0x06F
Swift | 3 | 5 | 12 | Yes | VFPv4 | Yes | 32 × 64-bit | 128-bit wide | No |  | 32 nm |  | 32 KiB + 32 KiB | 1 MiB | 2 | 3.5 | ?
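The "ARM part number" column refers to a field of the Main ID Register (MIDR), which also carries an implementer code, so a part number is only meaningful together with its implementer. The following is a minimal sketch, in Python, of how the register fields could be unpacked and matched against the Arm-designed cores in the table above. The function names and the example register value are illustrative assumptions, not taken from any particular tool.

```python
# Minimal sketch: split a 32-bit MIDR value into its architectural fields.
# Field positions follow the Arm architecture definition of MIDR:
# implementer [31:24], variant [23:20], architecture [19:16],
# part number [15:4], revision [3:0].
def decode_midr(midr: int) -> dict:
    return {
        "implementer": (midr >> 24) & 0xFF,
        "variant": (midr >> 20) & 0xF,
        "architecture": (midr >> 16) & 0xF,
        "part_number": (midr >> 4) & 0xFFF,
        "revision": midr & 0xF,
    }

IMPLEMENTER_ARM = 0x41  # ASCII 'A', the implementer code used by Arm

# Part numbers copied from the ARMv7-A table (Arm-designed cores only).
ARM_V7A_PARTS = {
    0xC05: "Cortex-A5",
    0xC07: "Cortex-A7",
    0xC08: "Cortex-A8",
    0xC09: "Cortex-A9",
    0xC0D: "Cortex-A12",
    0xC0E: "Cortex-A17",
    0xC0F: "Cortex-A15",
}

def core_name(midr: int):
    """Return the core name for an Arm-implemented ARMv7-A part, else None."""
    fields = decode_midr(midr)
    if fields["implementer"] != IMPLEMENTER_ARM:
        return None  # part numbers from other implementers are not comparable
    return ARM_V7A_PARTS.get(fields["part_number"])

# Illustrative register value: implementer 0x41, part number 0xC07.
example_midr = 0x410FC075
print(core_name(example_midr), decode_midr(example_midr))
```

Third-party cores such as Scorpion and Krait would need their own lookup table keyed by their implementer code, which is why the sketch returns None for them instead of guessing.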

ARMv8-A


This is a table of 64/32-bit central processing units that implement the ARMv8-A instruction set architecture and its mandatory or optional extensions. Most chips support the 32-bit ARMv7-A instruction set for legacy applications. All chips of this type have a floating-point unit (FPU) that is better than the one in older ARMv7-A and NEON (SIMD) chips. Some of these chips have coprocessors that also include cores from the older 32-bit architecture (ARMv7). Some of the chips are SoCs that combine both ARM Cortex-A53 and ARM Cortex-A57 cores, such as the Samsung Exynos 7 Octa.

Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution (have it) | Out-of-order execution (entries) | Branch prediction | big.LITTLE role | Exec. ports | SIMD | Fab (in nm) | Simult. MT | L0 cache | L1 cache Instr + Data (in KiB) | L2 cache | L3 cache | Core configurations | Speed per core (DMIPS/MHz[note 1]) | Clock rate | ARM part number (in the main ID register)
ARM Cortex-A32 (32-bit)[15] 2017 ARMv8.0-A
(only 32-bit)
2-wide8No0 ?LITTLE ? ? 28[16] NoNo8–64 + 8–640–1 MiBNo1–4+2.3 ?0xD01
Cortex-A34 (64-bit)[17] 2019 ARMv8.0-A
(only 64-bit)
2-wide8No0 ?LITTLE ? ?  ?NoNo8–64 + 8–640–1 MiBNo1–4+ ? ?0xD02
Cortex-A35[18] 2017 ARMv8.0-A2-wide[19]8No0YesLITTLE ? ? 28 / 16 /
14 / 10
NoNo8–64 + 8–640 / 128 KiB–1 MiBNo1–4+1.7[20]-1.85 ?0xD04
Cortex-A53[21] 2014 ARMv8.0-A2-wide8No0Conditional+
Indirect branch
prediction
big/LITTLE2 ? 28 / 20 /
16 / 14 / 10
NoNo8–64 + 8–64128 KiB–2 MiBNo1–4+2.24[22] ?0xD03
Cortex-A55[23] 2017 ARMv8.2-A2-wide8No0big/LITTLE2 ? 28 / 20 /
16 / 14 / 12 / 10 / 5[24]
NoNo16–64 + 16–640–256 KiB/core0–4 MiB1–8+2.65[25] ? 0xD05
Cortex-A57[26] 2013 ARMv8.0-A3-wide15Yes
3-wide dispatch
 ? ?big8 ? 28 / 20 /
16[27] / 14
NoNo48 + 320.5–2 MiBNo1–4+4.1[20]-4.8 ?0xD07
Cortex-A65[28] 2019 ARMv8.2-A
(only 64-bit)
2-wide10-12Yes
4-wide dispatch
Two-level ?9  ? SMT2 No32–64 + 32–64 KiB0, 64–256 KiB0, 0.5–4 MiB1-8 ? ? 0xD06
Cortex-A65AE[29] 2019 ARMv8.2-A ? ?Yes Two-level ?2  ? SMT2 No32–64 + 32–64 KiB64–256 KiB0, 0.5–4 MiB1–8 ? ? 0xD43
Cortex-A72[30] 2015 ARMv8.0-A3-wide15 Yes
5-wide dispatch
Two-levelbig8 28 / 16 No No48 + 320.5–4 MiBNo1–4+4.7[22]-6.3[31] ? 0xD08
Cortex-A73[32] 2016 ARMv8.0-A2-wide11–12 Yes
4-wide dispatch
Two-levelbig7 28 / 16 / 10 No No64 + 32/641–8 MiBNo1–4+4.8[20]–8.5[31] ? 0xD09
Cortex-A75[23] 2017 ARMv8.2-A 3-wide 11–13 Yes
6-wide dispatch
Two-level big 8? 2*128b 28 / 16 / 10 No No 64 + 64 256–512 KiB/core 0–4 MiB 1–8+ 6.1[20]–9.5[31]  ? 0xD0A
Cortex-A76[33] 2018 ARMv8.2-A 4-wide 11–13 Yes
8-wide dispatch
128Two-levelbig8 2*128b 10 / 7 No No 64 + 64 256–512 KiB/core 1–4 MiB 1–4 6.4  ? 0xD0B
Cortex-A76AE[34] 2018 ARMv8.2-A  ?  ? Yes 128 Two-level big  ?  ? No No  ?  ?  ?  ?  ?  ? 0xD0E
Cortex-A77[35] 2019 ARMv8.2-A 4-wide 11–13 Yes
10-wide dispatch
160Two-levelbig 12 2*128b 7 No 1.5K entries 64 + 64 256–512 KiB/core 1–4 MiB 1–4 7.3[20][36]  ? 0xD0D
Cortex-A78[37][38] 2020 ARMv8.2-A 4-wide Yes 160 Yes big 13 2*128b No 1.5K entries 32/64 + 32/64 256–512 KiB/core 1–4 MiB 1–4 7.6-8.2  ? 0xD41
Cortex-X1[39] 2020 ARMv8.2-A 5-wide[39]  ? Yes 224 Yes big 15 4*128b No 3K entries 64 + 64 up to 1 MiB[39] up to 8 MiB[39] custom[39] 10-11  ? 0xD44
Apple Cyclone[40] 2013 ARMv8.0-A6-wide[41]16[41]Yes[41] 192YesNo9[41] 28[42] No No64 + 64[41]1 MiB[41]4 MiB[41]2[43] ?1.3–1.4 GHz
Typhoon 2014 ARMv8.0‑A6-wide[44]16[44]Yes[44] YesNo9 20 No No64 + 64[41]1 MiB[44]4 MiB[41]2, 3 (A8X) ?1.1–1.5 GHz
Twister 2015 ARMv8.0‑A6-wide[44]16[44]Yes[44] YesNo9 16 / 14 No No64 + 64[44]3 MiB[44]4 MiB[44]
No (A9X)
2 ?1.85–2.26 GHz
Hurricane 2016 ARMv8.0‑A 6-wide[45] 16 Yes "big" (In A10/A10X paired with "LITTLE" Zephyr
cores)
9 3*128b 16 (A10)
10 (A10X)
No No 64 + 64[46] 3 MiB[46] (A10)
8 MiB (A10X)
4 MiB[46] (A10)
No (A10X)
2x Hurricane (A10)
3x Hurricane (A10X)
 ? 2.34–2.36 GHz
Zephyr ARMv8.0‑A 3-wide 12 Yes LITTLE 5 16 (A10)
10 (A10X)
No No 32 + 32[47] 1 MiB 4 MiB[46] (A10)
No (A10X)
2x Zephyr (A10)
3x Zephyr (A10X)
 ? 1.09–1.3 GHz
Monsoon 2017 ARMv8.2‑A[48] 7-wide 16 Yes "big" (In Apple A11 paired with "LITTLE" Mistral
cores)
11 3*128b 10 No No 64 + 64[47] 8 MiB No 2x Monsoon  ? 2.39 GHz
Mistral ARMv8.2‑A[48] 3-wide 12 Yes LITTLE 5 10 No No 32 + 32[47] 1 MiB No Mistral  ? 1.19 GHz
Vortex 2018 ARMv8.3‑A[49] 7-wide 16 Yes "big" (In Apple A12/Apple A12X/Apple A12Z paired with "LITTLE" Tempest
cores)
11 3*128b 7 No No 128 + 128[47] 8 MiB No 2x Vortex (A12)
4x Vortex (A12X/A12Z)
 ? 2.49 GHz
Tempest ARMv8.3‑A[49] 3-wide 12 Yes LITTLE 5 7 No No 32 + 32[47] 2 MiB No 4x Tempest  ? 1.59 GHz
Lightning 2019 ARMv8.4‑A[50] 8-wide 16 Yes 560 "big" (In Apple A13 paired with "LITTLE" Thunder
cores)
11 3*128b 7 No No 128 + 128[51] 8 MiB No 2x Lightning  ? 2.65 GHz
Thunder ARMv8.4‑A[50] 3-wide 12 Yes LITTLE 5 7 No No 96 + 48[52] 4 MiB No 4x Thunder  ? 1.8 GHz
Firestorm 2020 ARMv8.4-A[53] 8-wide[54] Yes 630[55] "big" (In Apple A14 and Apple M1/M1 Pro/M1 Max/M1 Ultra paired with "LITTLE" Icestorm
cores)
14 4*128b 5 No 192 + 128 8 MiB (A14)
12 MiB (M1)
24 MiB (M1 Pro/M1 Max)
48 MiB (M1 Ultra)
No 2x Firestorm (A14)
4x Firestorm (M1)

6x or 8x Firestorm (M1 Pro)
8x Firestorm (M1 Max)
16x Firestorm (M1 Ultra)

 ? 3.0–3.23 GHz
Icestorm ARMv8.4-A[53] 4-wide Yes 110 LITTLE 7 2*128b 5 No 128 + 64 4 MiB
8 MiB (M1 Ultra)
No 4x Icestorm (A14/M1)
2x Icestorm (M1 Pro/Max)
4x Icestorm (M1 Ultra)
 ? 1.82–2.06 GHz
Avalanche 2021 ARMv8.6‑A[53] 8-wide Yes "big" (In Apple A15 and Apple M2/M2 Pro/M2 Max/M2 Ultra paired with "LITTLE" Blizzard
cores)
14 4*128b 5 No 192 + 128 12 MiB (A15)
16 MiB (M2)
32 MiB (M2 Pro/M2 Max)
64 MiB (M2 Ultra)
No 2x Avalanche (A15)
4x Avalanche (M2)
6x or 8x Avalanche (M2 Pro)

8x Avalanche (M2 Max)
16x Avalanche (M2 Ultra)

 ? 2.93–3.49 GHz
Blizzard ARMv8.6‑A[53] 4-wide Yes LITTLE 8 2*128b 5 No 128 + 64 4 MiB
8 MiB (M2 Ultra)
No 4x Blizzard  ? 2.02–2.42 GHz
Everest 2022 ARMv8.6‑A[53] 8-wide Yes "big" (In Apple A16 paired with "LITTLE" Sawtooth
cores)
14 4*128b 5 No 192 + 128 16 MiB No 2x Everest  ? 3.46 GHz
Sawtooth ARMv8.6‑A[53] 4-wide Yes LITTLE 8 2*128b 5 No 128 + 64 4 MiB No 4x Sawtooth  ? 2.02 GHz
Nvidia Denver[56][57] 2014 ARMv8‑A 2-wide hardware
decoder, up to
7-wide variable-
length VLIW
micro-ops
13 Not if the hardware
decoder is in use.
Can be provided
by dynamic software
translation into VLIW.
Direct+
Indirect branch
prediction
No 7 28 No No 128 + 64 2 MiB No 2  ?  ?
Denver 2[58] 2016 ARMv8‑A  ? 13 Not if the hardware
decoder is in use.
Can be provided
by dynamic software
translation into VLIW.
Direct+
Indirect branch
prediction
"Super" Nvidia's own implementation  ? 16 No No 128 + 64 2 MiB No 2 ?  ?
Carmel 2018 ARMv8.2‑A  ? Direct+
Indirect branch
prediction
 ? 12 No No 128 + 64 2 MiB (4 MiB @ 8 cores) 2 (+ 8) 6.5-7.4  ?
Cavium ThunderX[59][60] 2014 ARMv8-A2-wide9[60]Yes[59] Two-level ? 28 No No78 + 32[61][62]16 MiB[61][62]No8–16, 24–48 ? ?
ThunderX2
[63](ex. Broadcom Vulcan[64])
2018[65] ARMv8.1-A
[66]
4-wide
"4 μops"[67][68]
 ?Yes[69] Multi-level ? ? 16[70] SMT4 No32 + 32
(data 8-way)
256 KiB
per core[71]
1 MiB
per core[71]
16–32[71] ? ?
Marvell ThunderX3 2020[72] ARMv8.3+[72]8-wide ?Yes
4-wide dispatch
Multi-level ?7 7[72] SMT4[72]  ?64 + 32512 KiB
per core
90 MiB60 ? ?
Applied Micro Helix 2014 ? ? ? ?  ? ? ? 40 / 28 No No32 + 32 (per core;
write-through
w/parity)[73]
256 KiB shared
per core pair (with ECC)
1 MiB/core2, 4, 8 ? ?
X-Gene 2013  ?4-wide15Yes  ? ? ? 40[74] No No8 MiB84.2 ?
X-Gene 2 2015  ?4-wide15Yes  ? ? ? 28[75] No No8 MiB84.2 ?
X-Gene 3[75] 2017  ? ? ? ?  ? ? ? 16 No No ? ?32 MiB32 ? ?
Qualcomm Kryo 2015 ARMv8-A ? ?Yes Two-level?"big" or "LITTLE"
Qualcomm's own similar implementation
 ? 14[76] No No32+24[77]0.5–1 MiB2+26.3 ?
Kryo 200 2016 ARMv8-A 2-wide 11–12Yes
7-wide dispatch
Two-levelbig 7 14 / 11 / 10 / 6[78] No No 64 + 32/64? 512 KiB/Gold Core No 4 ?1.8–2.45 GHz
2-wide 8No 0 Conditional+
Indirect branch
prediction
LITTLE 2 8–64? + 8–64? 256 KiB/Silver Core 4 ?1.8–1.9 GHz
Kryo 300 2017 ARMv8.2-A 3-wide 11–13Yes
8-wide dispatch
Two-levelbig 8 10[78] No No 64+64[78] 256 KiB/Gold Core 2 MiB 2, 4 ?2.0–2.95 GHz
2-wide 8No 0 Conditional+
Indirect branch
prediction
LITTLE 28 16–64? + 16–64? 128 KiB/Silver 4, 6 ?1.7–1.8 GHz
Kryo 400 2018 ARMv8.2-A 4-wide 11–13Yes
8-wide dispatch
Yesbig 8 11 / 8 / 7 No No 64 + 64 512 KiB/Gold Prime

256 KiB/Gold

2 MiB 2, 1+1, 4, 1+3 ?2.0–2.96 GHz
2-wide 8No 0 Conditional+
Indirect branch
prediction
LITTLE 2 16–64? + 16–64? 128 KiB/Silver 4, 6  ? 1.7–1.8 GHz
Kryo 500 2019 ARMv8.2-A 4-wide 11–13Yes
8-wide dispatch
Yesbig 8 / 7 No ? 512 KiB/Gold Prime

256 KiB/Gold

3 MiB 2, 1+3  ? 2.0–3.2 GHz
2-wide 8No 0 Conditional+
Indirect branch
prediction
LITTLE 2 ? 128 KiB/Silver 4, 6  ? 1.7–1.8 GHz
Kryo 600 2020 ARMv8.4-A 4-wide 11–13Yes
8-wide dispatch
Yesbig 6 / 5 No ? 64 + 64 1024 KiB/Gold Prime

512 KiB/Gold

4 MiB 2, 1+3  ? 2.2–3.0 GHz
2-wide 8No 0 Conditional+
Indirect branch
prediction
LITTLE 2 ? 128 KiB/Silver 4, 6  ? 1.7–1.8 GHz
Falkor[79][80] 2017[81] "ARMv8.1-A features";[80] AArch64 only (not 32-bit)[80]4-wide10–15Yes
8-wide dispatch
Yes ?8 10 No 24 KiB88[80] + 32500KiB1.25MiB40–48 ? ?
Samsung M1[82][83] 2016 ARMv8-A4-wide13[84]Yes
9-wide dispatch[85]
96 big8 14 No No64 + 322 MiB[86]No4 ?2.6 GHz
M2[82][83] 2017 ARMv8-A 4-wide 100Two-levelbig 10 No No 64 + 64 2 MiB No 4  ? 2.3 GHz
M3[84][87] 2018 ARMv8.2-A6-wide15Yes
12-wide dispatch
228Two-levelbig12 10 No No64 + 64512 KiB per core4096KB4 ?2.7 GHz
M4[88] 2019 ARMv8.2-A 6-wide 15Yes
12-wide dispatch
228Two-levelbig 12 8 / 7 No No 64 + 64 512 KiB per core 3072KB 2  ? 2.73 GHz
M5[89] 2020 ARMv8.2-A 6-wide Yes
12-wide dispatch
228Two-levelbig 7 No No 64 + 64 512 KiB per core 3072KB 2  ? 2.73 GHz
Fujitsu A64FX[90][91] 2019 ARMv8.2-A 4/2-wide 7+Yes
5-way?
Yesn/a 8+ 2*512b[92] 7 No No 64 + 64 8MiB per 12+1 cores No 48+4  ? 1.9 GHz+
HiSilicon TaiShan V110[93] 2019 ARMv8.2-A 4-wide ? Yes n/a 8 7 No No 64 + 64 512 KiB per core 1 MiB per core  ?  ?  ?
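SoCs that combine Cortex-A53 and Cortex-A57 cores report a different part number for each core type, so the part numbers in the table above can be used to tell the clusters apart. The sketch below, in Python, counts core types from text in the style of the "CPU implementer" and "CPU part" lines that Linux prints in /proc/cpuinfo on ARM systems. The sample text is fabricated for illustration (a hypothetical 4 + 4 layout), and the helper names are assumptions.

```python
from collections import Counter

IMPLEMENTER_ARM = 0x41  # implementer code used by Arm

# Part numbers copied from the ARMv8-A table; the role is the one each core
# plays when the two are paired in a single SoC.
PARTS = {
    0xD03: ("Cortex-A53", "LITTLE"),
    0xD07: ("Cortex-A57", "big"),
}

def fake_block(index: int, part: int) -> str:
    """Build one hypothetical per-CPU block in /proc/cpuinfo style."""
    return (
        f"processor\t: {index}\n"
        f"CPU implementer\t: 0x{IMPLEMENTER_ARM:02x}\n"
        f"CPU part\t: 0x{part:03x}\n"
    )

# Fabricated sample: four Cortex-A53 cores followed by four Cortex-A57 cores.
SAMPLE = "\n".join(fake_block(i, 0xD03 if i < 4 else 0xD07) for i in range(8))

def count_cores(cpuinfo_text: str) -> Counter:
    """Count (core name, role) pairs for Arm-implemented cores in the text."""
    counts = Counter()
    implementer = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and implementer == IMPLEMENTER_ARM:
            part = int(value, 16)
            counts[PARTS.get(part, (f"unknown part 0x{part:03X}", "?"))] += 1
    return counts

for (name, role), n in count_cores(SAMPLE).items():
    print(f"{n} x {name} ({role})")
```

On real hardware the same function could be fed the contents of /proc/cpuinfo, with the lookup table extended from the part-number column for any other cores present.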

ARMv9-A

Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution (have it) | Out-of-order execution (entries) | Branch prediction | big.LITTLE role | Exec. ports | SIMD | Fab (in nm) | Simult. MT | L0 cache | L1 cache Instr + Data (in KiB) | L2 cache | L3 cache | Core configurations | Speed per core (DMIPS/MHz[note 1]) | Clock rate | ARM part number (in the main ID register)
Arm Holdings | Cortex-A510 | May 2021 | ARMv9-A | 3 instructions decoded per cycle | 8 stages | No | N/A (does not support out-of-order execution) | Advanced techniques similar to larger cores, specifics not disclosed | LITTLE | 3 execution ports | Yes (supports SIMD instructions) | 5 nm (common for SoCs using Cortex-A510) | No | N/A | 32 or 64 KB each | Configurable, typically 128 KB to 512 KB | N/A | Typically paired with Cortex-A710 in configurations (e.g., 1+3) | Not explicitly stated, but performance uplift of 35% over A55 | Up to 2.85 GHz (varies by implementation) | Not specified
Arm Holdings | Cortex-A710 | May 2021 | ARMv9.0-A | 5 instructions decoded per cycle | 10 stages | Yes | 13 entries | Enhanced with larger structures and better accuracy | big | 5 execution ports | Yes | 5 nm | Yes | Not specified | 64/128 KiB each | 256/512 KiB | Optional, up to 16 MiB | Typically 1+3+4 (big.LITTLE) | Not specified | Up to 3.0 GHz (approx.) | Not specified
Arm Holdings | Cortex-A715 | June 2022 | ARMv9-A | 5 instructions decoded per cycle | 15 stages | Yes | 128 entries | Advanced branch prediction capabilities | big | 4 execution ports | Yes | 4 nm | Yes | Not specified | 64 KiB each | 1 MiB | 16 MiB (in certain configurations) | 1+3+4 or similar setups | Not specified, but designed for high efficiency | Up to 2.8 GHz | Not specified
Arm Holdings | Cortex-X2 | May 2021 | ARMv9-A | 6 instructions per cycle | 15 stages | Yes | 2048 entries | Advanced, with improved accuracy | big | 3 execution ports | Yes | 5 nm | Yes | Not specified | 64 KiB each | 1 MiB | 8 MiB | 1+3+4 (X2+A710+A510) | Not specified | Up to 3.2 GHz | Not specified
Arm Holdings | Cortex-X3 | June 2022 | ARMv9.0-A | 1 instruction per cycle | 15 stages | Yes | 128 entries | Advanced branch prediction capabilities | big | 4 execution ports | Yes | 4 nm | Yes | Not specified | 64 KiB each | 1 MiB | 16 MiB | 1+3+4 or up to 8+4 | Not specified | Up to 3.6 GHz | Not specified

ARMv9-M

Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution (have it) | Out-of-order execution (entries) | Branch prediction | big.LITTLE role | Exec. ports | SIMD | Fab (in nm) | Simult. MT | L0 cache | L1 cache Instr + Data (in KiB) | L2 cache | L3 cache | Core configurations | Speed per core (DMIPS/MHz[note 1]) | Clock rate | ARM part number (in the main ID register)

ARMv9-R

Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution (have it) | Out-of-order execution (entries) | Branch prediction | big.LITTLE role | Exec. ports | SIMD | Fab (in nm) | Simult. MT | L0 cache | L1 cache Instr + Data (in KiB) | L2 cache | L3 cache | Core configurations | Speed per core (DMIPS/MHz[note 1]) | Clock rate | ARM part number (in the main ID register)

See also

Notes

  1. As Dhrystone (implied in "DMIPS") is a synthetic benchmark developed in the 1980s, it is no longer representative of prevailing workloads; use with caution.
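The "Speed per core" columns give Dhrystone MIPS normalized by clock frequency, so an estimated DMIPS figure for one core is the product of that value and the clock rate in MHz. The sketch below, in Python, applies this to a few DMIPS/MHz values from the ARMv7-A table. The 1000 MHz clock is a hypothetical figure chosen for easy arithmetic, not a specification of any chip, and the caveat in the note above applies to the results.

```python
import math

def estimated_dmips(dmips_per_mhz: float, clock_mhz: float) -> float:
    """Estimated single-core Dhrystone MIPS at a given clock rate."""
    return dmips_per_mhz * clock_mhz

# DMIPS/MHz values copied from the ARMv7-A table.
DMIPS_PER_MHZ = {
    "Cortex-A7": 1.9,
    "Cortex-A9": 2.5,
    "Cortex-A15": 3.5,  # lower bound of the listed "3.5 to 4.01" range
}

CLOCK_MHZ = 1000.0  # hypothetical clock rate, for illustration only

for core, per_mhz in DMIPS_PER_MHZ.items():
    print(f"{core}: about {estimated_dmips(per_mhz, CLOCK_MHZ):.0f} DMIPS at {CLOCK_MHZ:.0f} MHz")

# Relative comparison at equal clocks reduces to the ratio of DMIPS/MHz values.
ratio = DMIPS_PER_MHZ["Cortex-A15"] / DMIPS_PER_MHZ["Cortex-A7"]
assert math.isclose(ratio, 3.5 / 1.9)
```

Because the clock cancels out when two cores run at the same frequency, the DMIPS/MHz column alone is enough for a same-clock comparison; real chips run at different clocks, which is why the ARMv8-A table lists clock rate separately.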

References
