The sad state of FAX standardisation

December 31st, 2008

People think of the ITU as a standards body, and many of their documents say something like “ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU” on the front. However, when it comes to the actual document title they say something like “ITU-T Recommendation T.30”. They are nothing more than recommendations, widely treated with contempt by industry when it comes to compliance. Some ITU-T recommendations, like the G.168 recommendation, are rightly ridiculed in the industry. G.168 is a test spec for telephone line echo cancellers. It contains a lot of good tests, which a well designed canceller should pass. Used as a guide for the product developer it is very useful. As a formal test spec it suffers the minor drawback that you can simply mute the receive channel and pass the letter of most of the tests, while completely ignoring the spirit. Most ITU-T recommendations have areas where they are vague, and open to interpretation. Many have areas which hardly any real world equipment complies with. In this rant I want to look at the recommendations related to FAX.

Let’s start with the modem documents related to FAX - V.21, V.27ter, V.29, V.17 and V.34. V.21 is pretty simple, and there isn’t too much to get wrong implementing that. V.27ter, V.29 and V.17 have several areas where very few real world products match what the ITU-T documents say they should send. For example, they should send a period of all 1’s at the end of their training data, so the receiver can test if the training was successful. It is so common for modems to send something else that no receiver can realistically use these test periods. This means a developer of a new implementation has to test against numerous products in the field, find the quirks they have, and make sure their own implementation tolerates all the abuses. This is very time consuming, and is a huge drag on getting solid products into the field. Some real world implementations just can’t be tolerated. For example, most modern Canon FAX machines ship with a V.29 modem so broken its signal is almost impossible to decode accurately. Most of the millions of thermal paper FAX machines still in use only support V.29 and V.27ter, so Canon machines will usually end up falling back to V.27ter, and sending FAXes to them at 4800bps. Canon seem to get away with this because most business FAX machines will communicate at the faster V.17 speeds, and never even try V.29. Maybe it’s unfair to single out Canon, as pretty much every FAX machine maker has many skeletons in their closet. I can’t say much about V.34, as I have no personal experience of implementing V.34 FAX. However, it seems to have taken a long time to shake out the V.34 FAX designs, so they interoperate well. It doesn’t sound like a solid document everyone adheres to.

The T.4 recommendation seriously lacks focus. It covers a hotch-potch of things related to FAX, although its most important function is to define the main image compression schemes for FAX. It does this surprisingly well. Although there are a couple of variations in implementation out there, if you build a compressor and decompressor strictly to this document it should just work.

The T.30 recommendation is the heart of modern FAX. This was put together by a group of FAX industry people who thought compatibility was a four letter word (yes, their arithmetic really does seem to have been that good :-) ). At the end of the 70s it was really important for these guys to get their act together over compatibility, in order for a FAX machine industry to get off the ground. They approached this with the enthusiasm of a man going to the guillotine. They produced one of the classically woolly documents. You’ll find more about how to build a usable FAX machine from studying the tests in TSB85 (a test spec for T.30 compliant FAX machines from the TIA) than from T.30 itself. T.30 is a mass of half finished statements, which fail to tie down exactly what they mean. It hasn’t got any better over time. Try looking at the latest additions and tell me precisely what their definition of an Internet aware FAX machine really is. FAX today works because of endless interoperability testing a long time ago, and people sticking to what they found worked. It’s not spec based. It’s folk knowledge based.

The T.38 recommendation is the key one for making traditional FAX machines interoperate with the IP world. It is much newer than the other specs, and you might expect people would have learned their lesson, and made sure it really told the implementor what to send and what to expect. Dream on. This document makes T.30 look like a robust standard. In some places the original was so bad there are a few clarifications in later revisions. These clarifications seem merely intended to poke fun at people having problems implementing T.38. They clarify nothing, and introduce fresh new confusion. Implementing T.38 is mostly a matter of looking at what rubbish other people send, and trying to fit in with it… and it’s a mess. It’s 10 years since the first revision of T.38 saw the light of day, and interoperability is still a farce. When checking if your box might be compatible with another, don’t just check models. You will need to check the exact software revisions they are using too. This is not just the case with $30 ATAs, where the suppliers really don’t give a damn about quality. It’s true of the big infrastructure makers too.

So, what should an earnest engineer, trying to do a good job, do? Cry, would be my first recommendation. Once you’ve got that out of your system, try to follow basic sound engineering - be cautious about what you send, and expect to receive a lot of garbage. Send precisely what the recommendation says, where it says something precise. A lot of things are not defined very precisely, so try to send to the lowest common denominator, and see how it works out in the real world - test, test and test again, with every piece of kit you can find. On the receive side, be prepared to tolerate any weird signals you can, and test, test and test again.

Denial - it’s not just a river in Egypt

December 7th, 2008

Have you ever worked with the T.38 FAX over IP specification? It’s more of a vague outline than a spec. Almost every implementation produces its own unique and distinctive pattern of messages, and message timing on the wire. It’s not because the implementers suck (although some probably do). It’s because this so-called spec is so vague and open ended. Ask anyone who has tried to set up large scale T.38 usage, and they’ll tell you a tale of woe: of how many boxes don’t interoperate, and how many that do only do so when you pick the right firmware revisions.

The SIP Forum recently held its first “Fax-over-IP Interoperability Workshop”. You might guess from the title that people are having problems, and need to hammer them out in a neutral forum. You can find some material from the workshop here. If you look through that, you’ll get a mixed picture. However the good people at Commetrex, in this, their latest newsletter, tell us the consensus view at the meeting was that there are no serious interoperability issues with T.38 these days. I find this astonishing. If you haven’t met Commetrex before, they are a major provider of T.38 engines to equipment makers. They also run a T.38 interoperability lab service, where people can get their T.38 implementations checked out for free. These are definitely people who should know the score.

It’s time for config files to come out of the wiring closet.

June 4th, 2008

Config files have a tough time. They start out trying to fit in by looking really ordinary. They try to look like

parameter = value

or

parameter: value

but they need more freedom to be their true selves. So, they try to add a little order to their lives. They add a little structure with section headings like

[section1]
parameter = value
[section2]
other-parameter = other-value

but they still feel constrained. Some of their Native American brethren, from the Apache tribe, go a little further, with

<section>
  parameter = value
</section>

but face scorn for being a little on the wild side. Well, it’s time they broke free, came out of the wiring closet, and admitted they were XML all along. Can you imagine the freedom it brings to finally admit that what you’ve always wanted to say to the world is

<?xml version="1.0"?>
<!DOCTYPE global-tones SYSTEM "../tones.dtd">
<global-tones>
  <tone-set country="United States" uncode="us">
    <dial-tone>
      <step freq="350+440 || 480+720" level="-13 [1.5dB]"/>
    </dial-tone>
    <dial-tone domain="PABX">
      <step freq="350+440 || 480+720" level="-16 [0.75dB]"/>
    </dial-tone>
    <dial-tone type="recall">
      <step cycles="3">
        <step freq="350+440" level="-13 [1.5dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
      <step freq="350+440" level="-13 [1.5dB]"/>
    </dial-tone>
    <dial-tone domain="PABX" type="recall">
      <step cycles="3">
        <step freq="350+440" level="-16 [0.75dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
      <step freq="350+440" level="-16 [0.75dB]"/>
    </dial-tone>
    <ringback-tone>
      <step cycles="endless">
        <step freq="440+480 || 380+460" level="-19 [1.5dB]" length="2.0"/>
        <step length="4.0"/>
      </step>
    </ringback-tone>
    <ringback-tone domain="PABX">
      <step cycles="endless">
        <step freq="440+480" level="-16 [1.5dB]" length="1.0"/>
        <step length="3.0"/>
      </step>
    </ringback-tone>
    <busy-tone>
      <step cycles="endless">
        <step freq="480+620 || 480+720" level="-24 [1.5dB]" length="0.5"/>
        <step length="0.5"/>
      </step>
    </busy-tone>
    <busy-tone domain="PABX">
      <step cycles="endless">
        <step freq="480+620 || 480+720" level="-21 [1.5dB]" length="0.5"/>
        <step length="0.5"/>
      </step>
    </busy-tone>
    <congestion-tone>
      <step cycles="endless">
        <step freq="480+620 || 480+720" level="-24 [1.5dB]" length="0.25"/>
        <step length="0.25"/>
      </step>
    </congestion-tone>
    <congestion-tone domain="PABX">
      <step cycles="endless">
        <step freq="480+620 || 480+720" level="-21 [1.5dB]" length="0.25"/>
        <step length="0.25"/>
      </step>
    </congestion-tone>
    <call-waiting-tone>
      <step cycles="endless">
        <step freq="480+620" level="-13 [1.5dB]" length="0.3"/>
        <step length="10.0"/>
      </step>
    </call-waiting-tone>
    <call-waiting-tone domain="PABX" type="station-call">
      <step freq="480+620" level="-16 [1.5dB]" length="0.3"/>
    </call-waiting-tone>
    <call-waiting-tone domain="PABX" type="outside-call">
      <step cycles="2">
        <step freq="480+620" level="-16 [1.5dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
    </call-waiting-tone>
    <call-waiting-tone domain="PABX" type="urgent-call">
      <step cycles="3">
        <step freq="480+620" level="-16 [1.5dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
    </call-waiting-tone>
    <special-information-tone>
      <step freq="950" level="-13 [1.5dB]" length="0.33"/>
      <step freq="1400" level="-13 [1.5dB]" length="0.33"/>
      <step freq="1800" level="-13 [1.5dB]" length="0.33"/>
    </special-information-tone>
    <warning-tone type="operator-intervening">
      <step freq="440" level="-13 [1.5dB]" length="2.0"/>
      <step length="10.0"/>
      <step freq="440" level="-13 [1.5dB]" length="0.5"/>
      <step length="10.0"/>
    </warning-tone>
    <warning-tone type="operator-intervening" domain="PABX">
      <step freq="440" level="-13 [1.5dB]" length="1.5"/>
      <step length="8.0"/>
      <step freq="440" level="-13 [1.5dB]" length="0.5"/>
      <step length="8.0"/>
    </warning-tone>
    <waiting-tone domain="PABX">
      <step freq="440" level="-13 [1.5dB]" length="0.3"/>
      <step length="10.0"/>
    </waiting-tone>
    <record-tone>
      <step freq="1400" level="-13 [1.5dB]" length="0.5"/>
      <step length="15.0"/>
    </record-tone>
    <executive-override-tone domain="PABX">
      <step freq="440" level="-14 [1.5dB]" length="3.0"/>
    </executive-override-tone>
    <intercept-tone domain="PABX">
      <step freq="440" level="-13 [1.5dB]" length="0.25"/>
      <step freq="620" level="-13 [1.5dB]" length="0.25"/>
    </intercept-tone>
    <confirmation-tone>
      <step freq="350+440" level="-13 [1.5dB]" length="0.1"/>
      <step length="0.1"/>
      <step freq="350+440" level="-13 [1.5dB]" length="0.3"/>
    </confirmation-tone>
    <confirmation-tone domain="PABX">
      <step cycles="2">
        <step freq="350+440" level="-16 [1.5dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
      <step freq="350+440" level="-16 [1.5dB]" length="0.1"/>
    </confirmation-tone>
  </tone-set>
</global-tones>

Doesn’t that feel good? You can build your own little nest where you feel at home. You are no longer limited in the depths you can express.
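
For anyone wondering how much work it is to read such a nest back in, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The semantics of the attributes are my guesses, not anything defined by tones.dtd: I assume "||" in freq separates alternative tone sets, "+" joins frequencies sounded together, a step with no freq is silence, and a step with no length runs until something stops it. The names parse_freq, flatten and load_tones are purely illustrative, and SAMPLE is a cut down fragment of the file above.

```python
import xml.etree.ElementTree as ET

# A cut down fragment of the tone file shown above.
SAMPLE = """\
<global-tones>
  <tone-set country="United States" uncode="us">
    <dial-tone type="recall">
      <step cycles="3">
        <step freq="350+440" level="-13 [1.5dB]" length="0.1"/>
        <step length="0.1"/>
      </step>
      <step freq="350+440" level="-13 [1.5dB]"/>
    </dial-tone>
  </tone-set>
</global-tones>
"""


def parse_freq(text):
    # ASSUMPTION: "||" separates alternatives, "+" joins simultaneous tones.
    return [[int(f) for f in alt.split("+")] for alt in text.split("||")]


def flatten(step):
    """Turn one <step> element into a nested Python structure."""
    children = list(step)
    if children:
        # A step containing steps is treated as a repeated group.
        return {"cycles": step.get("cycles", "1"),
                "steps": [flatten(child) for child in children]}
    freq = step.get("freq")
    length = step.get("length")
    return {
        "freq": parse_freq(freq) if freq else None,    # None: assumed silence
        "level": step.get("level"),                    # kept as raw text
        "length": float(length) if length else None,   # None: assumed open ended
    }


def load_tones(xml_text):
    """Index every tone by (country code, tone name, domain, type)."""
    root = ET.fromstring(xml_text)
    tones = {}
    for tone_set in root.iter("tone-set"):
        for tone in tone_set:
            key = (tone_set.get("uncode"), tone.tag,
                   tone.get("domain"), tone.get("type"))
            tones[key] = [flatten(step) for step in tone]
    return tones


if __name__ == "__main__":
    for key, steps in load_tones(SAMPLE).items():
        print(key, steps)
```

The nesting falls straight out of the recursion in flatten, which is rather the point: a flat parameter = value file would need its own invented syntax to say "repeat these two steps three times".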

Narrow band detection

December 18th, 2007

Detecting the presence of narrow band signals - not the actual signal, but merely that the signal in the channel is not broad band - is a vital or useful task in a number of telephony DSP situations. Really robust echo cancellers need to detect narrow band energy, and freeze their adaption when it is present. Here the compute cost is secondary to robust operation, but keeping the compute down is obviously important. If a really lightweight narrow band detector is practical, then things like supervisory tone detectors could potentially be enabled only when the narrow band detector says there is something suspicious in the channel. That might reduce their average compute load.

There are various obvious ways to assess narrowbandiness. The auto-correlation function of the signal will be peaky if the energy is predominantly narrow band. That peakiness is not hard to detect, but the total compute load is quite high. A really low compute method has eluded me, to date. The Teager Kaiser Energy Operator (TKEO) is a very low compute calculation, which produces a constant output for a single tone of constant amplitude. For a single tone, x(n) = A cos(ωn + φ), TKEO is simply:

$$\Psi(x(n)) = x(n-1)^2 - x(n)\,x(n-2) = A^2 \sin^2(\omega)$$

Where ω is the usual suspect, and A is the amplitude of the tone. The following modified form is more robust, though it produces the same answer for more than one frequency - not really a problem if you are only trying to detect that some fairly constant amplitude narrow band energy is present:

$$\Psi_k(x(n)) = x(n-k)^2 - x(n)\,x(n-2k) = A^2 \sin^2(k\omega)$$

Where k is an integer constant. The larger k is, the more frequencies map to the same value, but the less noise affects the result. If there are two tones (typical maximum for supervisory tones) it produces a sine wave output at the difference frequency. That’s not hard to assess, so TKEO looks like a good basis for a detector. The snag is it is horribly sensitive to noise. The up side is each calculation involves only 3 samples. The down side is each calculation involves only 3 samples. You are highly dependent on the accuracy of those 3 samples, and even small amounts of noise mess up the answers rather badly. If you apply a simple pole or two of LPF to the output of the TKEO you can damp the effects of noise quite a lot. However, I can’t seem to come up with a robust scheme that remains low compute.
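
To make the behaviour described above concrete, here is a rough Python sketch of the modified operator followed by a single pole low pass filter. It is only an illustration under assumptions of my own (8000 samples/second, a 440Hz tone, k = 2, Gaussian noise with a standard deviation of 0.02, and a filter coefficient of 0.05), not a workable detector. The helper names tkeo, one_pole_lpf and spread are made up for this example.

```python
import math
import random


def tkeo(x, k=1):
    """Modified Teager-Kaiser operator: x(n-k)^2 - x(n)*x(n-2k), for n >= 2k."""
    return [x[n - k] ** 2 - x[n] * x[n - 2 * k] for n in range(2 * k, len(x))]


def one_pole_lpf(values, alpha=0.05):
    """Single pole low pass filter: y += alpha*(v - y), seeded with values[0]."""
    y = values[0]
    out = []
    for v in values:
        y += alpha * (v - y)
        out.append(y)
    return out


def spread(values):
    """(max - min)/|mean|: close to zero when the operator output is steady."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / abs(mean)


# Assumed test conditions, chosen only for illustration.
fs = 8000.0
freq = 440.0
amp = 0.5
k = 2
omega = 2.0 * math.pi * freq / fs

tone = [amp * math.cos(omega * n + 0.3) for n in range(800)]

# For a clean tone every output sample should equal A^2 sin^2(k*omega).
clean = tkeo(tone, k)
expected = amp ** 2 * math.sin(k * omega) ** 2

# Add a little noise, then compare the raw and the filtered operator output.
rng = random.Random(1)
noisy = [s + rng.gauss(0.0, 0.02) for s in tone]
raw = tkeo(noisy, k)
smoothed = one_pole_lpf(raw)[200:]  # discard the filter's settling period

print("expected       ", expected)
print("clean sample   ", clean[0])
print("spread clean   ", spread(clean))
print("spread raw     ", spread(raw))
print("spread smoothed", spread(smoothed))
```

The spread figure is just a crude flatness measure. A real detector would still need a threshold that survives varying signal levels and noise, and that is exactly the part I can't seem to make robust while keeping the compute low.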

Freescale have some application notes (e.g. http://www.freescale.com/files/dsp/doc/app_note/AN2384.pdf) on the web describing tone detection using TKEO. They even have what appears to be a detailed description of the algorithms they use. However, when I try modelling what they describe, I can’t seem to get an adequate result. Lots of application notes are complete drivel, describing things that can never work. These Freescale notes are tantalisingly plausible :-) . Of course, if it works well for them, they have probably created a patent minefield for people like me, following behind.