X-Fi Performance And Technology Specs
The following specifications show just how powerful this new platform is and give an insight into how it is put together.
DSP
| Architecture |
Thread Interleaved Multiple Data (TIMD) |
All 4 sub-processors effectively operate independently. Advanced General Purpose architectures often employ VLIW
(Very Large Instruction Word) to gain parallelism. Optimal efficiency is obtained in VLIW architectures when the
compiler is able to identify parallel execution branches and combine them into single instructions. For static
applications, VLIW can be effective. However, the nature of our applications is generally far from static:
we require dynamic signal flow topologies that enable effect chains of arbitrary complexity. In the final implementation of the X-Fi product, great care has been taken to minimize scheduling hazards that often arise in multi-processor programs. The end result is higher sustainable throughput through the processor.
|
| Subprocessors |
4 |
|
| Subprocessor Architecture |
2xSIMD |
Each sub-processor is able to operate on 2 data streams simultaneously. 2x was chosen specifically because our applications typically operate on channel counts that are multiples of two (stereo, quad, etc.), which optimizes overall DSP utilization. In addition, frequency domain operations often involve complex numbers, which come in pairs (real and imaginary parts). |
| Datatypes |
floating and fixed |
The main advantage of floating point processing is that it maintains precision over a wide dynamic range, which makes it particularly well suited for frequency domain applications. The DSP also supports fixed point computation, which facilitates porting of existing algorithms. |
| Instructions |
235 Opcodes |
The instruction set is very rich, enabling a wide variety of algorithms, from encoders and decoders to frequency domain algorithms, to be realized on the X-Fi processor. |
| Specialized Audio Instructions |
~60 |
The specialized audio instructions further increase the efficiency of the processor. Of particular interest are the following instructions:
| FADDSUB2 | floating point add in one datapath, subtract on the other |
| FCMULT | floating point complex multiply, real result from one datapath, imaginary result from the other |
| FMACM | R = A - (X * Y) |
| FSINCOS | floating point sine and cosine approximation |
| INTERP | SIMD linear interpolate |
| LOG | logarithm instruction |
| EXP | anti-logarithm instruction |
|
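As a rough illustration of what these two-datapath instructions compute, here is a small software model in Python. The function names mirror the mnemonics, but the operand layout (a tuple of two lanes) and the exact pairing of operands are assumptions inferred from the one-line descriptions above, not the actual X-Fi instruction encoding.

```python
import math

# Hypothetical software model of a 2-lane (2xSIMD) datapath.
# Lane layout and operand order are assumptions made for illustration;
# the source only gives a one-line description of each instruction.

def faddsub2(a, b):
    """Add in one datapath (lane 0), subtract in the other (lane 1)."""
    return (a[0] + b[0], a[1] - b[1])

def fcmult(a, b):
    """Complex multiply of (real, imag) pairs: real result in lane 0, imaginary in lane 1."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def fmacm(a, x, y):
    """R = A - (X * Y), applied independently in each lane."""
    return tuple(ai - xi * yi for ai, xi, yi in zip(a, x, y))

def interp(a, b, t):
    """Linear interpolation from a to b in both lanes, with t in [0, 1]."""
    return tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))

def fsincos(theta):
    """Sine in one lane, cosine in the other (exact here; the hardware approximates)."""
    return (math.sin(theta), math.cos(theta))
```

For example, `fcmult((1.0, 2.0), (3.0, 4.0))` models the product (1 + 2i)(3 + 4i), the kind of operation that dominates frequency domain processing.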
Clocking and PLL
| Internal clocking |
400MHz |
|
| External Clock masters |
24.576 MHz Crystal
PCI Clock 33MHz & 66MHz
S/PDIF Input derived bit clock/4
I²S Input bit clock
I²S Input LR clock
GPIO
|
External clocking is important for interfacing with external gear to achieve sample accurate results - be it recording within the X-Fi chip or recording to an external device.
|
| PLL performance |
Jitter < 60pSecs |
Jitter and its perceived effect is a controversial subject in the community. In general, lower is better, but how much is perceptible remains open. Theoretically jitter will introduce distortion more at the higher frequencies than lower frequencies. For reference, the 10Kx processors have jitter in the neighborhood of 110pSecs.
|
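The frequency dependence mentioned above can be made concrete with the standard textbook bound for a full-scale sine wave sampled with random clock jitter: SNR = -20 log10(2π f t_j). The sketch below applies it to the two jitter figures quoted here. It assumes those figures are RMS values, which the source does not state, and the function name is ours.

```python
import math

def jitter_limited_snr_db(freq_hz, jitter_rms_s):
    """Upper bound on SNR (dB) for a full-scale sine sampled with random clock jitter."""
    return -20.0 * math.log10(2.0 * math.pi * freq_hz * jitter_rms_s)

# Assumption: the quoted jitter numbers are RMS values in picoseconds.
for jitter_ps in (60, 110):
    for freq_hz in (1_000, 20_000):
        snr = jitter_limited_snr_db(freq_hz, jitter_ps * 1e-12)
        print(f"{jitter_ps} ps at {freq_hz} Hz: {snr:.1f} dB")
```

Under this model the bound at 20 kHz is about 26 dB lower than at 1 kHz for the same jitter, which is why the effect is expected to show up at the higher frequencies first.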
Filter
| Filters |
512 floating point 2nd order IIR |
Filters are used for a variety of applications ranging from typical implementation of effects such as reverbs, EQs or speaker modeling to more sophisticated applications such as 3D spatialization and speaker calibration.
In 3D spatialization filters are often used to achieve occlusion, obstruction, distance, and environmental effects. By placing these filters in hardware, many more 3D sources can be realized with lower host CPU loading.
For music synthesis applications, filters represent one of the basic building blocks. Typical voice architectures for synthesizers will involve a number of filters placed downstream from an oscillator or signal source. Low order filters can be combined in a multitude of ways to realize more complex filter behaviors (parallel, serial, etc).
|
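To show what a single 2nd order IIR section does and how low order sections combine in series or in parallel, here is a minimal software sketch. The low-pass coefficients follow the widely published "Audio EQ Cookbook" formulas and are only a stand-in: the source does not document how the hardware filter types compute their coefficients, and all function names are illustrative.

```python
import math

def lowpass_coeffs(fs, f0, q):
    """2nd order low-pass coefficients (cookbook formulas), normalized so a0 = 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [-2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def cascade(samples, sections):
    """Serial combination: feed the output of each section into the next."""
    for b, a in sections:
        samples = biquad(samples, b, a)
    return samples

def parallel(samples, sections):
    """Parallel combination: run every section on the same input and sum the outputs."""
    outputs = [biquad(samples, b, a) for b, a in sections]
    return [sum(values) for values in zip(*outputs)]
```

Two identical low-pass sections in `cascade` give a 4th order response with a steeper roll-off, which is the sense in which low order filters act as building blocks.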
| Filter types |
13 |
Included in the HW is support for the following filter types:
- off
- parametric EQ 4-parameter
- parametric EQ 5-parameter
- low-shelving
- high-shelving
- dual-shelving
- peak (band-pass)
- notch (band-reject)
- low-pass resonator
- high-pass resonator
- direct coefficient 4-parameter
- direct coefficient 5-parameter
|
Mixer
| Audio summation nodes |
256 |
Audio summation nodes represent mixing channels or strips. Effectively with 256 nodes, you could realize an X-channel mixer with Y-stereo aux busses and a Z-channel master bus. The summation nodes combine up to 4096 signals with HW support for scaling and aggregating two separate sources.
|
| Parameter nodes |
1024 |
Parameter nodes are used to combine control parameters often used for synthesis or effects control. For example, an effect parameter may have several modulation sources: key number, pitch wheel, and LFO. Each combination of modulation parameters will consume a node. The parameter nodes are also used extensively in 3D audio game applications. Implementation of Doppler shifting, obstruction and occlusion, and distance effects typically requires parameter mixing.
|
| Parameter rampers |
4096 Single segment
1536 Multi segment
4 Multi segment shapes
Linear, pseudo-exponential, and pseudo-logarithmic segment curves are supported
|
The parameter ramping hardware incrementally changes the value of parameters over a period of time, avoiding the large discontinuities that produce audible artifacts such as clicks and pops. The multi-segment rampers can be used to generate the complex envelopes commonly used by synthesizers for pitch and volume control.
|
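The idea behind parameter ramping can be sketched in a few lines of software. The version below covers only linear segments and uses hypothetical function names. The pseudo-exponential and pseudo-logarithmic curves are not modeled, because the source does not define them.

```python
def linear_ramp(start, target, steps):
    """Yield `steps` values moving from start to target in equal increments."""
    increment = (target - start) / steps
    value = start
    for _ in range(steps):
        value += increment
        yield value

def multi_segment(start, segments):
    """Chain linear segments; `segments` is a list of (target, steps) pairs."""
    out = []
    value = start
    for target, steps in segments:
        ramp = list(linear_ramp(value, target, steps))
        ramp[-1] = target  # snap to the exact target to avoid accumulated rounding error
        out.extend(ramp)
        value = target
    return out

# A simple attack / decay / release envelope built from three segments.
envelope = multi_segment(0.0, [(1.0, 4), (0.5, 2), (0.0, 4)])
```

Because each step changes the parameter by a small increment rather than jumping straight to the target, the discontinuities that cause clicks and pops are avoided.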
Tank / Delay line processing engine
| Accesses per sample period |
1024 |
All delay based effects, such as reverbs, choruses, and flangers, require the tank engine. Delays are implemented by storing values in memory (usually a circular buffer) and dynamically re-computing the address location when a past value is needed. Re-computing the address value is more efficient than moving the data.
|
| Fractional and modulated delay support |
Yes |
For some intermediate effects, fractional delayed values are required. In the X-Fi processor, the HW provides support for interpolation of delayed values. |
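A minimal software sketch of the two ideas described for the tank engine: a circular buffer whose read address is re-computed instead of moving data, and a fractional delay obtained by interpolating between neighboring samples. Linear interpolation is assumed here for simplicity. The source does not say which interpolation the hardware uses, and the class name is illustrative.

```python
class DelayLine:
    """Circular-buffer delay with linearly interpolated fractional reads."""

    def __init__(self, max_delay):
        # One slot for delay 0 plus one extra neighbor for interpolation.
        self.buf = [0.0] * (max_delay + 2)
        self.write_pos = 0

    def write(self, sample):
        """Store a sample and advance the write address; no data is moved."""
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % len(self.buf)

    def read(self, delay):
        """Return the value `delay` samples back (0 = most recent, may be fractional)."""
        whole = int(delay)
        frac = delay - whole
        n = len(self.buf)
        newer = self.buf[(self.write_pos - 1 - whole) % n]
        older = self.buf[(self.write_pos - 2 - whole) % n]
        return newer + frac * (older - newer)
```

Sweeping the `delay` argument slowly with an LFO is the basis of chorus and flanger effects, which is where the fractional reads matter.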
SRC (Sample Rate Converters)
| # interpolators |
256 |
The SRCs act as the gatekeeper for almost all sources into the X-Fi processing world. Analog and those digital signals not at the reference sample rate enter the system through the SRCs. |
| Interpolation type |
A new unique hybrid design |
The hybrid SRC has a quality better than a typical poly-phase FIR implementation with order 100. |
| Supported pitch range |
0-8.0 |
If desired, multiple trips through the SRCs can be used to realize more dramatic pitch shifts |
| Quality |
Super high |
The quality of the SRCs is extremely important to maintaining audio fidelity through the system. With high quality SRCs, a signal can enter the X-Fi processing world with minimal distortion. In the case of X-Fi, the distortion is practically 0. For example, we have results from converting a 997Hz tone from 44.1 kHz to 48 kHz: THD+N -136 dB, +/- 0.00025dB pass-band ripple.
With such quality, the X-Fi processor makes digital into analog again! Not only can you mix and match analog signals, but now you can mix and match digital signals with almost zero artifacts.
Professionally measuring the quality of the SRCs requires expensive test equipment. For example, an Audio Precision (AP) device costs more than $5000.
|
| DMA Support |
Yes |
SRCs incorporate an optional DMA engine that can be used to transfer audio data to and from system memory. This feature is essential for enabling ASIO playback and recording capabilities, sound synthesis, and game audio. |
| Advanced Caching |
Tunable cache |
The tunable cache enables the system to compensate for PCI bandwidth issues. As the Intel motherboard chips have altered their behaviors to address networking and other peripheral trends, we have to counteract the detrimental effects their changes have induced for audio. The tunable cache enables us to support more channels and reduce the likelihood of glitching due to starvation. |
I/O and internal channel support
| Input |
4 I²S IN = 8 channels, regardless of sample rate
4 CDIF IN = 32 channels @ 48k, 16 @ 96k, 8 @ 192k
|
Creative/Cambridge Digital Inter-Face (CDIF) is a proprietary digital protocol standard that supports up to 8 channels over a single wire. |
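The CDIF channel counts quoted for each sample rate are consistent with a fixed per-wire throughput. The short sketch below reproduces them under that assumption, which is an inference from the published figures rather than a documented property of the protocol. The constant and function names are ours.

```python
# Assumption: each CDIF wire carries a fixed number of samples per second,
# so the channel count per wire halves whenever the sample rate doubles.
CHANNELS_PER_WIRE_AT_48K = 8  # "up to 8 channels over a single wire"
WIRES = 4

def cdif_channels(sample_rate_hz, wires=WIRES):
    """Total CDIF channels available at a given sample rate."""
    per_wire = CHANNELS_PER_WIRE_AT_48K * 48_000 // sample_rate_hz
    return wires * per_wire
```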
| Output |
4 I²S OUT = 8 channels, regardless of sample rate
4 CDIF OUT = 32 channels @ 48k, 16 @ 96k, 8 @ 192k
|
|
| Internal transport |
4096 Internal audio channels
4096 Internal parameter channels
|
|
Audio processing
| HRTFs / Headphone processing |
HRTF Data collected from multiple sources (UCDavis, Aureal, and Sensaura)
Patents on MacroFX, binaural 3D panning.
48-tap FIR filters, 128 3D sources.
|
We have the best 3D gaming audio ever invented. Whether you are playing over 7.1 speakers or even headphones, NO OTHER solution will deliver more accurate and believable 3D audio!
|
| X-Fi CMSS-3D |
Patents on ambience generation and directional segregation |
The latest in up-mixing technology. Use CMSS-3D to immerse yourself inside your legacy stereo audio tracks. This feature uses X-Fi's frequency domain processing capabilities.
The approach yields better results, especially with respect to sound field stability, than Dolby Pro Logic II/IIx. It also achieves more natural timbre and spaciousness than Dolby Headphone. These effects are clearly demonstrable.
|
| Band Splitting |
Super efficient patented band-split signal processing technology employing QMF (Quadrature Mirror Filter) implementation |
Band-splitting can split high sample-rate recordings into up to 4 x 48kHz bands. Effects can then be applied to these separate bands before the bands are rejoined. Without band-splitting, applying full-band effects to high sample-rate recordings would be wasteful and restrictive.
Band-splitting can reduce the computational cost by up to a factor of 4 (in the case of 48kHz->192kHz), so it is a highly efficient way of allowing users to apply effects to very high sample-rate recordings.
|
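The split-and-rejoin principle can be illustrated with the simplest quadrature mirror filter pair, the two-tap Haar filters, applied twice to obtain four quarter-rate bands. This is only a sketch of the general QMF idea. The patented X-Fi implementation is not described in the source and would use longer filters with far better band separation. All function names are illustrative, and the input length is assumed to be a multiple of 4.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(x):
    """Two-band analysis: returns (low, high), each at half the input rate."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return low, high

def join(low, high):
    """Two-band synthesis: exactly inverts split()."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)
        out.append((l - h) / SQRT2)
    return out

def split4(x):
    """Apply the two-band split twice to obtain four quarter-rate bands."""
    low, high = split(x)
    return split(low) + split(high)

def join4(bands):
    """Rejoin four quarter-rate bands into the full-rate signal."""
    b0, b1, b2, b3 = bands
    return join(join(b0, b1), join(b2, b3))
```

An effect would be applied to each band between `split4` and `join4`. Since every band runs at a quarter of the original rate, an effect that only needs to process one band touches a quarter of the samples.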
| Active Modal Architecture |
The chip's architecture can switch between a number of "modes". |
Mode switching: 3 cards in one. No rebooting required.
The architecture resets itself to support the best features for a specific usage while still allowing base support for other uses.
|
| High quality effects |
|
Support for 24-bit/96kHz and 24-bit/192kHz effects. Absolutely no compromise to your audio quality. |
| ASIO 2.0 |
< 2mSec latency
0% CPU overhead
|
The buffer design is such that 0% CPU overhead is required to pass audio between the host and the hardware. |
| WDM / OpenAL |
|
Optimized driver and hardware support to obtain the lowest latency and highest quality interactive audio. |
Evolution of Sound Blaster Processing Power*
| | Raw Data Path MIPs (Note 1) | Typical Processor MIPs (Note 2) | Internal Audio Channels Available | Overall Audio Sample Rate & Effects Processing MIPS vs Live! | No. of Simultaneous Real-time Effects | No. of Transistors |
| Sound Blaster Pro | ≈1 | 3+ | - | 0.0001x | - | 100K |
| AWE 32 (EMU8000) | 67 | 200+ | - | 0.2x | - | 500K |
| Live! (10k1) | 335 | 1,000+ | 16 (to Effects Engine) | 1x | 1 | 2M |
| Audigy (10k2) | 424 | 1,250+ (Note 3) | 64 (to Effects Engine) | 4x | 4 (Note 3) | 4.6M |
| X-Fi | 10340 | 30,000+ | 4,096 (to all Processing Elements) | 67x (Note 3) | 8 (Note 4) | 51.1M |
Definition of Calculations:
Note 1 - Raw Data Path MIPs
Defined as the number of adds and multiplies times the execution frequency that can be applied to the signal data.
This does not include any operations that a typical processor must also perform to manage the signal data in or
through the processor.
Note 2 - Typical Processor MIPs
Defined as an estimate of the processing requirements of a typical processor in 1998 when the Live! was launched.
The estimate is of a typical processor from 1998 programmed to perform the same algorithms or functions in the Live!,
Audigy or X-Fi chip. These processors have certain inefficiencies found when programming a variety of algorithms.
The inefficiencies are typically 3x that of Raw Data path MIPS in a dedicated audio chip such as the Live!, Audigy or X-Fi.
Note 3 - Audigy vs Live!
Audigy has 4 times the power of the Live! to deliver EAX® ADVANCED HD technology. This improvement
in overall effects processing is provided by the 2x increase in the effects and tank processing engine and major optimization of the effects processing architecture. This enables the Audigy to deliver EAX ADVANCED HD effects
without a huge increase in required MIPs processing.
Note 4 - X-Fi Real-Time Effects and Sample Rate Conversion
X-Fi incorporates a new specialized DSP architecture with an instruction set supporting both fixed-point
and floating point data types, and a powerful sample rate conversion (SRC) engine. The SRC engine is
capable of significantly superior audio fidelity at 136dB. The effects engine in previous generations
of audio processors was limited to fixed-point data types. Furthermore, the SRC engines of
previous generations were only capable of delivering sound quality far lower than 86dB. This new
specialized architecture enables up to 24-Bit/192kHz effects and state of the art algorithms for high
quality audio processing and over 300 times improvement in the SRC engine. The number of real-time
effects and algorithms available are optimized to the number needed in each specific application,
resulting in a non-compromised level of tremendously higher audio quality over Audigy and Live!.
* Due to the differences in architecture, the figures shown are only a rough estimate of performance.
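One way to read the "over 300 times improvement" figure in Note 4 is as the amplitude ratio corresponding to the gap between the 136dB and 86dB figures. That reading is our assumption, since the note does not say how the number was derived. The arithmetic is:

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

# Assumption: the improvement is the amplitude ratio for 136 dB versus 86 dB.
improvement = db_to_amplitude_ratio(136 - 86)
print(f"{improvement:.0f}x")
```

A 50 dB difference corresponds to an amplitude ratio a little above 316, which is consistent with the "over 300 times" wording.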
Live!
The design of Live! was a new architecture, distinct from the previous AWE line of processors, and was state of the art in 1998. The concept was an architecture that allowed, for the first time, effects processing in real time with interactive audio to produce 3D audio and Environmental Effects. It was a traditional pipeline design that allowed multi-channel digital audio to flow in and out of the sound engine, with interactive effects processing and multi-channel audio inputs and outputs.

Audigy
The design of Audigy was a major improvement over the Live! with a 2x increase in the dedicated effects processing unit, major optimization of the overall effects processing architecture and added 24-bit ADVANCED HD capabilities. The 2x increase in the dedicated effects and tank processing unit and the major optimization in the architecture yielded a 4x improvement in overall effects processing capabilities without a huge increase in required MIPs processing. These changes provided the ability to run 4 simultaneous effects (3 high quality reverbs and 1 other high quality effect) that delivered the EAX ADVANCED HD 4.0 standard versus just 1 effect on the Live! (1 high quality reverb) that delivered the EAX 2.0 standard. The other addition of the 24-bit ADVANCED HD engine allowed 24-bit playback and recording versus 16-bit with the Live!.
The changes for Audigy from Live! are noted in bold in the diagram below.
X-Fi
The X-Fi is designed with a completely new architecture, unlike any other audio processor that has ever been conceived. This new architecture follows the concept of a software design rather than a traditional pipeline design in audio or graphics architectures. The design follows the ring architecture as shown below. This architecture is very flexible in signal routing, with 4096 audio ring channels that route audio to any of the processing elements. The Tank, SRC, Mixer, and DSP processing elements on this ring are far superior and more complex in technology as compared to the entire Audigy audio processor.
| | X-Fi Raw Data Path MIPs |
| SRC | 7310 |
| Filter | 200 |
| Mixer | 1210 |
| Tank | 440 |
| DSP | 1180 |
| Total | 10340 |

|
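The per-element figures can be cross-checked against the totals quoted earlier. The sketch below sums the raw data path MIPs and applies the 3x factor from Note 2. The dictionary is simply a transcription of the figures above, and the variable names are ours.

```python
# Raw data path MIPs per processing element, transcribed from the table above.
raw_mips = {"SRC": 7310, "Filter": 200, "Mixer": 1210, "Tank": 440, "DSP": 1180}

total_raw = sum(raw_mips.values())

# Note 2 estimates a typical 1998 processor at roughly 3x the raw data path MIPs.
typical_processor_mips = 3 * total_raw

print(total_raw, typical_processor_mips)
```

The sum matches the 10340 total, and three times that figure is in line with the "30,000+" typical processor MIPs quoted for X-Fi.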