******************************************************************************
* Host Intrinsics port for c62, 64, 64plus                                   *
*                                                                            *
* readme.txt                                                                 *
*                                                                            *
******************************************************************************

Description
===========
Host port of c62, 64, c64+ intrinsics. Enables customers to run/prototype
code on the host (e.g. PC, Sparc...) where the debug environment is often
richer.

New users: Please read the PowerPoint slides in
host_intrinsics_overview.ppt.

Tests & Dependencies
====================
DSP Side : Tested on :-
  - K_*\ tests :
      - c64x tests
      - designed to be 'real world' testcase usage of intrinsics
      - work *only in little endian*. Hence will pass when using
        TI C6000 DSP, cygwin gcc, Linux, MSVC etc. Will not work on a
        Sparc Unix box (since this is big-endian).
      - tested with CCS 3.1 C64xx simulator
  - J_*\ tests
      - c64Plus tests
      - designed to be 'real world' testcase usage of intrinsics
      - endian neutral. Hence will work on PC, Linux, Sparc etc.
      - tested with CCS 3.2 Early Adopter Release 5 : Himalaya Simulator
  - C6xSimulator_test.[ch] with C6xSimulator_main.[ch]
      - small unit tests of each and every intrinsic
      - endian neutral
      - tested on PC Microsoft Visual C, Cygwin GCC, PC and Sparc Matlab

Host-side : Platforms tested on :-
  - cygwin with gcc3.3.3
  - Linux hosted on a PC
  - Sparc Solaris workstation (for tests that support big-endian)
  - MSVC 6.0

How to build/run
================
DSP-side
  - build the pjt via e.g. (example for K_iir)
      - [>] c:\\dosrun.bat
      - [>] timake K_iir.pjt DEBUG -a
  - open the relevant CCS C6000 simulator
  - Load & run Debug\K_iir.out

Host-side
  - type 'make' in any of the example K_*/, J_*/ dirs.
  - Run ./dbg/.exe
  - Should get identical results for cn[] (natural C) vs. c[] (C with
    intrinsics)
  - Failures will be noted via 'result failure' else a pass will be shown.
  - All tests return fail = 0 or 1. 0 indicates success, 1 is failure.
    This enables you to script regression tests.
  - Note that there are 3 -D pre-processor switches to take note of: -
      -DSIMULATION :
          this is required in a host-side makefile because it ensures
          that e.g. function prototypes are given for the host-side
          intrinsics. On the DSP-side examples, function prototypes
          should never be listed for the native DSP intrinsics, hence
          -DSIMULATION is not used there.
      -DLITTLE_ENDIAN_HOST / -DBIG_ENDIAN_HOST :
          you need to select one or the other depending on your host
          type e.g. on an x86 hosted Linux box it will be
          -DLITTLE_ENDIAN_HOST, whilst on a Sparc Solaris it will be
          -DBIG_ENDIAN_HOST.
      -DTMS320C62X / -DTMS320C64X / -DTMS320C64PX :
          this host intrinsics port is supplied for c62, c64, and
          c64plus architectures. Select the one you are building for.
  - as an example, if you are building for c64x on a PC host you'd add
    to your makefile: -
      -DSIMULATION -DLITTLE_ENDIAN_HOST -DTMS320C64X

Big-endian testing (host-side)
  - to enable zero changes to the examples and unit_test when evaluating
    host intrinsics on a Big-Endian machine (e.g. Sparc), make 'goals'
    have been added for endian-neutral samples. Just do: -
      [>] make debug_be
    Presently the only endian-neutral samples are J_cnv_dec/ and
    unit_test/
    The default goal is still little endian (i.e. 'make' alone suffices
    here)

Misc
====
1. Each of the K_*/ & J_*/ tests requires certain alignment to run on the
   DSP. In addition, the tests themselves depend on this alignment. It is
   satisfied by the DATA_ALIGN pragma on TI DSP. However on a host
   platform this pragma is unknown. To provide alignment on a host
   platform we added e.g. for an alignment of 8 bytes
       "__attribute__ ((aligned (8)))"
   This works on GNU C. It can be easily #ifdef'd out via :-
       /* If we're not using GNU C, elide __attribute__ */
       #ifndef __GNUC__
       #  define __attribute__(x)  /*NOTHING*/
       #endif
   However, we elected to leave the warnings in place on the host build
   re the DATA_ALIGN pragma.
   This is to flag users on the host that alignment is important and can
   affect performance & correctness when moving code to the DSP. Also,
   non-GNU-C host environments need to find a different way to model the
   alignment.

2. The amemdN() intrinsics do not meet the characteristics of the C6000
   spec at present. For example, the _amemd8 and _amemd8_const intrinsics
   tell the compiler to read e.g. an array of shorts with doubleword
   accesses. This causes LDDW and STDW instructions to be issued for the
   array accesses on TI C64x/c64x+ DSP. There is no easy way to model
   this via an intrinsic on the host side given that
   (i)  often these amemdN() intrinsics are on the left hand side of an
        expression (hence a macro is required instead of a function)
   (ii) any wrapper macro around amemdN() on the host-side may require
        users to change large portions of their code - we wanted to
        avoid this since customers may have code from 3rd parties.

3. In some cases you may not have to modify your DSP code at all to
   'port' it to the host and use these host intrinsics. This may occur
   in cases where you are only using intrinsics which operate on types
   of 32 bits or less. At these sizes, we have found that on all the
   hosts we tested the type sizes are the same as c6x e.g. char = 1
   8-bit byte, short = 2 bytes, int = 4 bytes.
   However, if you use e.g.: -
       long _lssub (int src1, long src2)
   then long is a 40 bit type. This exists on few, if any, host
   platforms. Hence to port to a host platform it needs to be of the
   form: -
       int40 _lssub(int32 a, int40 b);
   where int40 is a typedef to 'long' on TI C6000 DSP and to 'long long'
   (most hosts) or '__int64' (MSVC 6.0) on the host, with the 40-bit
   result sign-extended to 64 bits. If the prototype were simply ported
   without regard for this, it would give wrong results on the host,
   since long is 32 bits on most hosts.
   The advice is to grep through your code for usage of 'long', 'double'
   etc. and take appropriate action i.e. change to typedefs.
   Note that if your DSP code uses C99 types such as uint64_t and you are
   on a platform which supports stdint.h e.g. Linux, cygwin GNU C etc.,
   you may have to do nothing... however it's likely you'll still need to
   account for the 40-bit long case.

4. You should do
       #include "C6xSimulator.h"
       #include "C6xSimulator_type_modifiers.h"
   in your code so that you can use the typedefs to enable your code to
   run on both TI C6000 DSP & your host platform. Use e.g. int64_d, int40
   typedefs etc. On TI C6000 DSP the only effect will be inclusion of the
   typedefs. On a host platform the intrinsic prototypes will be included
   etc.

   C6xSimulator.h has similarities with c6000\cgtools\include\c6x.h
   c6x.h is useful on dsp-code to check you've supplied the appropriate
   argument datatypes to c6000 intrinsics. However it is not suitable for
   use in a host environment because the types long, float, double etc.
   do not map 1-1 to host environments (e.g. long = 40 bits on c6x but
   32 bits on Win). Hence a typedef abstraction is required, which
   C6xSimulator.h supplies.

   Note that #include "C6xSimulator_type_modifiers.h" is optional. It is
   a header file which defines/undefines certain keywords, abstracted
   into its own file because different environments may support/not
   support different keywords. For example 'restrict' is newly supported
   in C99. Some environments and compilers may support this whilst others
   may not. Those that don't should undefine restrict. By virtue of this
   abstraction, several options exist for the user: -
     - use this file as is
     - don't use this file. Instead do defines/undefines in your makefile
     - use this file and modify it as per the keyword support in your
       host-env.
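
The 40-bit 'long' issue described in Misc item 3 can be sketched as
follows. This is a minimal illustration only, not the library's code: the
names int40_sketch, INT40_MAX_SKETCH, INT40_MIN_SKETCH, sat40_sketch and
lssub_sketch are hypothetical, and the sketch assumes a host with C99
stdint.h. It models a 40-bit saturating subtract by carrying the value in
a 64-bit type and clamping it to the signed 40-bit range.

```c
#include <stdint.h>

/* Hypothetical stand-in for an int40 typedef on a host: a 64-bit type
   that carries a sign-extended 40-bit value. */
typedef int64_t int40_sketch;

/* Limits of a signed 40-bit value: 2^39 - 1 and -2^39. */
#define INT40_MAX_SKETCH ((int64_t)0x7FFFFFFFFFLL)
#define INT40_MIN_SKETCH (-INT40_MAX_SKETCH - 1)

/* Clamp a 64-bit intermediate to the signed 40-bit range. */
static int40_sketch sat40_sketch(int64_t v)
{
    if (v > INT40_MAX_SKETCH) return INT40_MAX_SKETCH;
    if (v < INT40_MIN_SKETCH) return INT40_MIN_SKETCH;
    return v;
}

/* Illustrative 40-bit saturating subtract (a - b). The arithmetic is
   done in 64 bits, so it cannot overflow for any 32-bit a and any b
   already within the 40-bit range. */
static int40_sketch lssub_sketch(int32_t a, int40_sketch b)
{
    return sat40_sketch((int64_t)a - b);
}
```

For example, lssub_sketch(INT32_MAX, -1) yields 2147483648, which fits
in 40 bits but would not fit in a 32-bit host 'long'. This is the kind of
result that goes wrong when a 'long' prototype is ported to the host
unchanged.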

******************************************************************************
Release History Table of Contents
******************************************************************************

==============================================================================
Version 0.7 Table of Contents
==============================================================================
0.7-01. Bugs fixed
0.7-02. Known bugs

0.7-01. Bugs fixed
    1583 _smpy32 fails with certain inputs
    1590 40-bit intrinsics (like lsadd) may return wrong results
    1596 Multiply intrinsics mishandle 0 times a negative number
    1597 Add support for _rotl intrinsic

0.7-02. Known bugs

==============================================================================
Version 0.6 Table of Contents
==============================================================================
0.6-01. Bugs fixed
0.6-02. Known bugs

0.6-01. Bugs fixed
    1484 RESOLVED The intrinsic _cmpyr1 works incorrectly
    1485 RESOLVED _lmbd intrinsic has the variable m32 datatyped
                  incorrectly

0.6-02. Known bugs

==============================================================================
Version 0.5 Table of Contents
==============================================================================
0.5-01. New Features
0.5-02. Bugs fixed
0.5-03. Known bugs

0.5-01. New Features
    - added installers with click-wrap license for Linux & Windows
    - added unit tests for itof, ftoi

0.5-02. Bugs fixed
    1157 WONTFIX readme version # is wrong for v0.4
    1180 INVALID potential MSB issue on _lsadd host intrinsic
    1226 FIXED   mem8 intrinsic uses int40 instead of int64
    1344 FIXED   itof and ftoi produce erroneous result
    1349 FIXED   cnv_dec.c - issue when using -Wall or splint
    1456 FIXED   _mpyhlu uses x2 instead of x2u in multiplication

0.5-03. Known bugs

==============================================================================
Version 0.4 Table of Contents
==============================================================================
0.4-01. New Features
0.4-02. Bugs fixed
0.4-03. Known bugs

0.4-01. New Features
    - changed the header file hierarchy in c6xsim/ to minimize
      preprocessor inclusions (and hence potential collisions) in
      user-code. We now have: -
        - C6xSimulator_type_modifiers.h :
            defines/undefines keywords e.g. restrict, which is a C99
            keyword and hence may or may not be supported on a given
            platform. Optional hdr file.
        - C6xSimulator_base_types.h :
            defines the typedefs needed to attain DSP<->host code
            portability. Required hdr file (included by file
            C6xSimulator.h)
        - C6xSimulator.h :
            prototypes for host intrinsics. Required hdr file.
        - _C6xSimulator_priv.h :
            only used internally by the host intrinsics implementation.
            Do *not* include this header file in user-code.
      One side-effect of this, if you were using a previous version of
      host intrinsics, is that you may need to do...
          #include "C6xSimulator.h"
          #include "C6xSimulator_type_modifiers.h"
      instead of just...
          #include "C6xSimulator.h"

0.4-02. Bugs fixed
    Bugzilla #1122 - definition of "_hill" doesn't match the declaration
    Bugzilla #1150 - annoying ^M chars in several files

0.4-03. Known bugs

==============================================================================
Version 0.3 Table of Contents
==============================================================================
0.3-01. New Features
0.3-02. Bugs fixed
0.3-03. Known bugs

0.3-01. New Features
    - added in host_intrinsics_overview.ppt to provide FAEs & customers
      with an overview of the purpose of the host intrinsics library and
      how it works.

0.3-02. Bugs fixed
    Bugzilla #1071 - fix unions #ifdef protection in C6xSimulatorTypes.h
                     for dsp build

0.3-03. Known bugs

==============================================================================
Version 0.2 Table of Contents
==============================================================================
0.2-01. New Features
0.2-02. Bugs fixed
0.2-03. Known bugs

0.2-01. New Features
    (a) Implementation change to adopt unions and avoid aliasing at -O2.
        The previous implementation of host intrinsics had code of the
        following nature: -
            uint64_d y;
            int64x2u *py;
            py = (int64x2u *)&y;
            py->hi = a;
            py->lo = b;
            return(y);
        This implementation has been changed to adopt unions i.e.
            union reg64 y64;
            y64.x2u.hi = a;
            y64.x2u.lo = b;
            return(y64.x1u_d);

        The reason this was done was because the previous code breaks
        when strict-aliasing optimization is enabled, which is automatic
        at -O2 on many host compilers. Aliasing occurs when you can
        access a single object in more than one way, such as when two
        pointers point to the same object or when a pointer points to a
        named object. Disambiguating the alias allows the compiler to
        produce better code since it can retain values in registers.
        The compiler cannot easily determine if the structure definition
        (e.g. int64x2u) and the typedef'ed data type (e.g. uint64_d) are
        compatible types of the same size. The ISO C spec states that the
        result is undefined when you dereference a pointer that points
        to an object of a different (incompatible) type.
        In contrast, the union works, because it tells the compiler
        explicitly that they are the same size.

        Another solution to the aliasing problem is to simply add the
        -fno-strict-aliasing flag. However, this prevents the host
        compiler from making certain optimizations since it must assume
        that any pointer may alias with any other. Hence the code was
        modified to use unions instead.

        There are *NO* changes in the interface i.e. users are *NOT*
        required to make *ANY* changes to their application code to
        leverage the new, improved host intrinsics.

    (b) A new folder unit_test was added. It contains the unit test
        updated to run on any host platform. A sample makefile is
        supplied for hosts using gcc. If any single intrinsic produces
        an error, a message will be shown indicating the guilty
        intrinsic and the erroneous value. If all intrinsics produce the
        correct result, a simple 'pass' message will be displayed.
        Recall that this unit_test is endian neutral.
        Flip the -D build switch from -DLITTLE_ENDIAN_HOST to
        -DBIG_ENDIAN_HOST if you are on e.g. a Sparc.

0.2-02. Bugs fixed
    Bugzilla #874 - readme doesnt give details on -D flags needed

0.2-03. Known bugs

==============================================================================
Version 0.00.01 Table of Contents
==============================================================================
0.00.01-01. New Features
0.00.01-02. Bugs fixed
0.00.01-03. Known bugs

0.00.01-01. New Features
    - First release.

0.00.01-02. Bugs fixed
    - First release.

0.00.01-03. Known bugs
    - First release.