Everything You Need to Know about IEM’s

Posted on February 21, 2011




In-ear headphones can be divided into two categories: earbuds and in-ear monitors. This post focuses on in-ear monitors.

In-ear monitors (IEM’s) differ from earbuds in that they seal the ear canal completely, whereas earbuds simply sit outside it. This gives them better noise isolation and, in most cases, more accurate sound, because sonic energy is transferred into the ear canal more efficiently. Better noise isolation also means the user is less inclined to turn the volume up to dangerous levels to drown out external noise. However, isolation poses a safety risk of its own, as the user must rely almost entirely on sight in high-risk areas (traffic, for example).

Among the first IEM’s to come to market were the Etymotic ER-4 series of canalphones. Most consumer headphones at the time were earbuds or over-ear headphones, which offered less noise isolation and relatively low accuracy. The Etymotic ER-4S strove for balanced, accurate and clear sound, and was so successful that it is still sold in much the same form today. Unfortunately, for many consumers the “balanced” and “accurate” sound of the ER-4 series came across as flat and treble heavy, which led Etymotic to produce the ER-4P, with the slightly elevated bass response consumers are used to.

The Etymotic ER-4P. Legendary.

Noise Isolation

Most IEM’s require a good seal to deliver the full range of sound they are capable of.

Foam tips provide the best seal as they expand into the ear canal. Comply tips are usually preferred, although Shure’s Olive tips are also a favourite amongst audiophiles. Foam tips tend to accentuate bass harmonics and are usually preferred for slightly treble-heavy IEM’s such as the Etymotic HF3 and Shure SE535. Due to their porous nature, foam tips have a very limited life-span: those from the likes of Comply last only around a month, whilst Shure Olives last up to a year.

Silicone flange tips are a common consumer favourite for sheer convenience. They are quick and easy to insert into the ear canal and provide a reasonable seal. They are also extremely durable, often outlasting the IEM itself. Unlike foam tips, however, they don’t accentuate bass, and their noise isolation isn’t as good.

Triple-Flanged Silicone Tips were made popular by Etymotic and are supposed to provide superior noise isolation compared to traditional single-flanged tips. While they do offer very good noise isolation, many say that they accentuate treble frequencies and cause sibilance (high frequency hiss) in balanced armature IEM’s.

Driver Technologies

Balanced Armature

Balanced armature drivers are common in most high-end IEM’s. Although more expensive than dynamic drivers, their design lets them reproduce sound more accurately, especially in the treble region. A typical balanced armature driver consists of an armature suspended between two permanent magnets. A current is passed through a coil wound around the armature, which causes it to be attracted to one magnet or the other. The armature is connected to a shaft which joins onto the diaphragm. This movement, repeated thousands of times per second, reproduces the sound that we hear.


Advantages:

  1. Superior treble performance compared to dynamic drivers
  2. Lower drive power
  3. Can be tuned for accurate reproduction
  4. Multi-driver set-up can have a range far greater than dynamic drivers
  5. Much smaller than dynamic drivers


Disadvantages:

  1. Relatively fixed design means the driver moves less air than a dynamic driver
  2. More expensive
  3. Requires a good seal to deliver low frequencies effectively
  4. A single balanced armature driver covers a narrower frequency range than a single dynamic driver

Balanced armature IEM’s include:

  1. Shure SE315/SE425/SE535
  2. Westone IEM’s
  3. Etymotic ER Series and HF3/HF5
  4. Apple In-Ear Headphones


Dynamic

Dynamic drivers in IEM’s work in a similar way to the drivers found in over-ear headphones and larger speakers. The diaphragm is attached to a voice coil to which a current is applied. The coil and diaphragm are then attracted to or repelled by a permanent magnet, which moves the diaphragm and produces sound. Dynamic drivers are relatively uncommon in IEM’s (although they are the staple driver type in earbuds). Because they are able to move more air, their bass reproduction is better than that of balanced armature drivers. However, they cannot move as fast, meaning intricate details (including those within the bass range) may be lost.


Advantages:

  1. Cheaper to produce
  2. “Warmer” sound
  3. Better bass response
  4. Wider range per driver


Disadvantages:

  1. Tuning for accuracy is difficult
  2. Treble roll-off is prevalent in many dynamic IEM’s
  3. Larger drivers perform better, but take up considerably more space than BA drivers

Dynamic IEM’s include:

  1. Sennheiser CX-200/CX-300
  2. Sony MDR Series

Moving Armature

The Grado GR8 is an example of a Moving Armature IEM

Moving armatures are a relatively new technology which aims to combine the speed of balanced armature drivers with the benefits of dynamic drivers. Moving armature drivers work on a similar principle to balanced armature drivers, but contain a relatively large diaphragm. A single moving armature driver is able to reproduce audio to the same level of quality found in multi-driver balanced armature IEM’s. Notable examples include the Grado GR8/GR10 and the Japanese-made Ortofon e-Q5.

Frequency Characteristics

Most consumer IEM’s can be divided into categories by their frequency response. It is worth noting that frequency response, while important, is only one factor affecting the quality of IEM’s, and response alone cannot guarantee the quality of reproduction. However, it provides a useful way of ruling out IEM’s whose frequency response is tuned towards an undesirable characteristic.


Flat Response

This is the standard for uncoloured reproduction. With this type of response, no frequency is favoured over another and the sound will appear as intended by mastering. High-end audio equipment tries to achieve this, although no speaker or IEM has actually reproduced a perfectly flat response. Having said this, IEM’s such as the Etymotic ER-4S come very close.

It is worth noting that while a flat response “on paper” for a room speaker should appear flat, IEM’s must favour some frequencies over others to achieve “perceived flat response”. This is because mastering assumes that the sound will be played over room speakers. Room speakers have a noticeable characteristic of somewhat lacking treble, even if they are producing a “balanced sound”. This is because treble frequencies are the first to be absorbed by the air, and by the time the sound reaches our ears, much of the high frequency energy would have been absorbed. Mastering takes this into account and “boosts” high frequencies to overcome this.

IEM’s, however, do not have this issue: very little treble is absorbed over the short distance within the ear canal. This is why most reference IEM’s tend to lower treble frequencies and boost bass in order to sound flat.

U Response

Also known as the “Smile” response or “consumer sound”. This is what many modern consumers look for, and most consumer earbuds and IEM’s reproduce a sound similar to this. The effect is very noticeable in car stereos and especially in consumer IEM’s from the likes of Sennheiser and Beats by Dr. Dre, where a “beefy” bass response is coupled with a bright treble. Midrange is usually “towards the back”, although it may have several spikes to boost vocal reproduction. This response is also known as the ‘boom & tizz’ response.

N Response

IEM’s which produce this response are said to be mid-heavy. Notable examples include the Etymotic MC5, which places much of the focus on the mid-range and has a noticeable treble roll-off and lowered bass response. This type of response is not popular in the consumer market.

Bass Slope

IEM’s with this response are well suited to genres of music which benefit from a bass kick: drum and bass, dubstep and house, for example. This response is also very popular amongst the general public, and many consumers will happily judge the performance of a pair of IEM’s solely on its bass response.

Treble Slope

The treble slope is generally seen in high-end IEM’s geared towards mastering and critical listening. Having an elevated treble allows for greater perceived detail. The treble slope is also favoured by musicians, who need the extra treble to achieve a perceived flat response. This is because musicians tend to lose their high-frequency hearing first due to repeated exposure to loud sounds.
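As a rough illustration of the categories above, the shape of a response can be guessed from three averaged band levels. This is only a sketch: the function name, the three-band split and the 2 dB tolerance are assumptions made for the example, not a measurement standard.

```python
def classify_response(bass_db, mid_db, treble_db, tolerance_db=2.0):
    """Guess a response shape from average band levels in dB.

    All three levels must share the same reference; only their
    differences matter. The tolerance is an arbitrary assumption.
    """
    low = bass_db - mid_db      # how far bass sits above the midrange
    high = treble_db - mid_db   # how far treble sits above the midrange

    if abs(low) <= tolerance_db and abs(high) <= tolerance_db:
        return "Flat"
    if low > tolerance_db and high > tolerance_db:
        return "U response"     # both ends raised, recessed mids
    if low < -tolerance_db and high < -tolerance_db:
        return "N response"     # both ends lowered, mid-heavy
    # Otherwise the response tilts towards one end.
    return "Bass slope" if low > high else "Treble slope"


# Hypothetical band levels, invented for illustration
print(classify_response(6, 0, 5))    # boosted bass and treble
print(classify_response(-4, 0, -6))  # mid-heavy
```

A real frequency response has far more than three points, so a label like this is only a first impression of the overall shape.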


Soundstage

Soundstage refers to the 3D space the IEM creates. Generally, the further the speakers are from the ears, the larger the soundstage is going to be. All IEM’s have the inherent problem of sitting very close to the ear drum. This means that the sound produced by the driver is not “shaped” by the ear and ear canal before it hits the eardrum as we are used to, which can cause the sound to appear as if it were coming from the centre of the head.

Manufacturers have counteracted this problem by modifying the response of the IEM’s. Various software enhancements can also create the illusion of a larger soundstage by leaking the left and right channels into each other or by amplifying the differences between them.
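The two software tricks mentioned above can be sketched in a few lines. This is a simplified illustration, not how any particular player or manufacturer implements it: the function names and the `amount` and `width` parameters are made up for the example, and real crossfeed processing typically also filters and delays the leaked signal, which is left out here.

```python
def crossfeed(left, right, amount=0.3):
    """Leak a fraction of each channel into the other."""
    out_left = [(1 - amount) * l + amount * r for l, r in zip(left, right)]
    out_right = [(1 - amount) * r + amount * l for l, r in zip(left, right)]
    return out_left, out_right


def widen(left, right, width=1.5):
    """Amplify the difference between the channels.

    The shared part of the signal ("mid") is left alone, while the
    left/right difference ("side") is scaled by `width`.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2
        side = (l - r) / 2
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right


# A sound that exists only in the left channel
left, right = [1.0, 0.0], [0.0, 0.0]
print(crossfeed(left, right, amount=0.25))  # some of it now appears on the right
print(widen(left, right, width=2.0))        # the left/right contrast is exaggerated
```

With `width=1.0` the widening function returns the input unchanged, which is a handy sanity check.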

Decay Times

Imagine an IEM reproducing a loud “pop!”. The sound should appear, then stop immediately after it has ceased in the original recording. If the IEM continues to resonate after the sound has stopped, it will “colour” the sound with elements not present in the original recording. This is not a desirable trait in IEM’s and should be kept to a minimum. Balanced armature drivers are affected less than dynamic drivers, however, and the problem is much more pronounced in room speakers than in IEM’s.

A typical decay time graph showing the time taken for resonances to stop
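To make the idea of decay time concrete, here is a small sketch that models the leftover resonance as a fading sine wave and measures how long it takes to drop 60 dB below its peak. The model, the -60 dB threshold and the function names are all assumptions chosen for illustration; real decay plots come from measured impulse responses.

```python
import math


def ringing(freq_hz, tau_s, sample_rate, duration_s):
    """A decaying sine: a stand-in for a driver that keeps resonating.

    tau_s is the time constant of the decay; larger means longer ringing.
    """
    count = int(sample_rate * duration_s)
    samples = []
    for i in range(count):
        t = i / sample_rate
        samples.append(math.exp(-t / tau_s) * math.sin(2 * math.pi * freq_hz * t))
    return samples


def decay_time(samples, sample_rate, threshold_db=-60.0):
    """Seconds from the peak to the last sample at or above the threshold."""
    peak_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    limit = abs(samples[peak_index]) * 10 ** (threshold_db / 20)
    last_loud = peak_index
    for i in range(peak_index, len(samples)):
        if abs(samples[i]) >= limit:
            last_loud = i
    return (last_loud - peak_index) / sample_rate


# Two hypothetical drivers ringing at 1 kHz, one well damped and one not
tight = ringing(1000, 0.0005, 48000, 0.05)
loose = ringing(1000, 0.005, 48000, 0.05)
print(decay_time(tight, 48000), decay_time(loose, 48000))
```

The poorly damped example takes roughly ten times longer to fall silent, which is the kind of difference a decay graph is meant to reveal.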

Bass Port/Reflex

If a driver is placed in a sealed enclosure, the air pressure inside the enclosure will tend to push it back to its original position after it has moved. While this can be favourable for producing sounds more accurately and reducing unnecessary resonances (thus avoiding colouring the sound), it increases the power required to move the driver (which must overcome the difference in air pressure) and limits the driver’s bass response.

One way for manufacturers to overcome this problem is to put “bass ports” onto the enclosure. Bass ports essentially allow the air pressure inside and outside the enclosure to equalise, allowing the driver to be more “floppy”. This increases bass response and also lowers the power needed for the driver to produce sound. Adding vents to the enclosure, however, reduces the accuracy of the driver as it is now more “floppy”, although this problem can be overcome by “tuning” the bass ports.

Tuning a bass port essentially means placing the vent in a strategic position on the enclosure so that air pressure is only equalised when it should be (i.e. when the driver is reproducing low frequencies) and the enclosure behaves like a sealed one when the driver is reproducing higher frequencies. In effect, tuning the bass port allows the driver and enclosure to get the best of both worlds.


Frequency Range

Which IEM is better?

  A. 15 Hz – 24 kHz
  B. 22 Hz – 16 kHz

If you answered (A), then you’re wrong. If you answered (B), however, you are also wrong. Many misinformed consumers judge the quality of a pair of IEM’s on frequency range alone, and often frequency range is the only specification on the packaging (aside from sensitivity and impedance) which gives any indication of the performance of a pair of IEM’s. So what’s going on here?

Frequency range only provides one piece of information: the range of frequencies an IEM can cover. It gives no indication of whether the IEM will reproduce all frequencies equally or whether it will distort certain “difficult” frequencies. In fact, most IEM’s whose quoted range extends below 100 Hz will distort these frequencies or ignore them altogether. The same can be said of treble frequencies above 6 kHz, where many IEM’s (especially dynamic ones) will distort.
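A short sketch shows why the quoted range says so little. The two sets of measurements below are invented for illustration (they are not real products), and the function names and the 100 Hz to 10 kHz comparison band are assumptions for the example.

```python
def spec_range(response):
    """The figure a box would quote: lowest and highest frequency covered."""
    freqs = sorted(response)
    return freqs[0], freqs[-1]


def flatness_db(response, low_hz=100, high_hz=10_000):
    """Spread between the loudest and quietest point inside a band.

    A smaller number means the frequencies are reproduced more evenly.
    """
    levels = [db for hz, db in response.items() if low_hz <= hz <= high_hz]
    return max(levels) - min(levels)


# Hypothetical measurements: frequency in Hz -> level in dB relative to 1 kHz
iem_a = {15: -18, 100: 6, 1000: 0, 6000: -9, 10000: -4, 24000: -25}
iem_b = {22: -3, 100: 1, 1000: 0, 6000: 1, 10000: -2, 16000: -6}

for name, response in (("A", iem_a), ("B", iem_b)):
    print(name, spec_range(response), flatness_db(response))
```

In this made-up case the IEM with the wider quoted range is also the far less even one, which is exactly the information the packaging leaves out.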

Frequencies below 30 Hz and over 16 kHz

The human ear can hear sounds between the limits of 20 Hz and 20 kHz, so we might assume that a pair of IEM’s covering the 20 Hz to 20 kHz range will suffice. Not quite. The truth is, most people can only hear a range from 100 Hz to 16 kHz, and even then, treble frequencies close to 16 kHz would only be heard as “high” with very little actual perceived detail. This can explain why the Etymotic ER-4S has an upper limit of 16 kHz, but still reproduces more perceived treble detail than cheaper IEM’s which claim to reproduce frequencies of up to 22 kHz.

How Splurgebook tests IEM’s

IEM’s are put through two tests to determine overall sound quality, alongside checks of other factors such as comfort.

General Use

The IEM is plugged directly into an iPhone 3GS and a Samsung YP-P2. No equalisation is used and all sound “enhancements” are turned off. Only FLAC/ALAC and 320 kbps MP3 (LAME 3.98) files are used as source audio.

Critical Listening

The IEM is plugged into a FiiO E7 (WM8740) portable amp which is line-fed from an iPhone 3GS (Airplane Mode). All enhancements are turned off. Only ALAC is used.

Noise Isolation

Tested in a quiet room with noise at 35-37 dB SPL.
Tested in an outside environment with noise at 45-60 dB SPL.