TúHúE FúPúU IúNúSúTúRúUúCúTúIúOúN TúUúTúOúRúIúAúL You might be thinking why write/read a tutorial on FPU instructions,they wont help in coding viruses. Nor will they help in ezing my life, so if thats what you really think then you are very wrong. These instructions help a lot. As the name suggests they not only handle integers, but we can play with decimals too. Believe me these instructions will help you create viruses which will go virtually undetected by the AVs (well probably :)).How? Just write a encryptor/decryptor or poly using/for these. Because half of the AV's emulators hang on emulating them or just think the file is not a infec- ted one, your creations will go unnoticed. So lets get on to business: Its better you check whther your system handles FPU instructions. For this you need to check whther a coprocessor is installed. This is done by a simple instruction: SMSW EAX Check the lower bit, if its set we have a coprocessor. Otherwise lets go watch some movie. Coming to coprocessors, if a processors are 386, 486, 586....then a coprocessors are 387, 487, 587....and they handle FPIs. Now comes the IEEE standard 754, these are Intel's standard for making the FPU's understand the floating point (decimal) numbers. They are coded as follows: S-Sign, E-Exponent, F-Fraction or simply S,E,F. The length of S is one bit (0 if the operand is positive and 1 otherwise). Length of F is equal to: All_Bits - E_Length - 1. Usually the floating point numbers are expressed as: S, 2^E*F For our ease, the FPU provides us with its "stacks" to do the calculations or store the values, etc..These are: ST(0), ST(1), ST(2),....ST(9) ST(0) is also reprsented as ST. These stacks hold the floating point nu- mbers while the operation is being carried out. Now lets see what happens when you load these registers. On the first load things get loaded to ST and after each load the stack registers increase and the operand which you loaded first also goes to higher regs, till ST(9). Something like this: load a : ST(0) = a; ST(1) = 0; ST(2) = 0..... load b : ST(0) = b; ST(1) = a; ST(2) = 0..... load c : ST(0) = c; ST(1) = b; ST(2) = a..... I hope things might be getting clear, if not read again. The most used registers are ST(0) and ST(1). Below is a table of most used FPU instructions taken from TECHHELP. ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Data Transfer and Constants ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ FLD src Load real: st(0) = src (mem32/mem64/mem80) FILD src Load integer: st(0) = src (mem16/mem32/mem64) FBLD src Load BCD: st(0) = src (mem80) FLDZ Load zero: st(0) = 0.0 FLD1 Load 1: st(0) = 1.0 FLDPI Load pi: st(0) = ã (ie, pi) FLDL2T Load log2(10): st(0) = log2(10) FLDL2E Load log2(e): st(0) = log2(e) FLDLG2 Load log10(2): st(0) = log10(2) FLDLN2 Load loge(2): st(0) = loge(2) FST dest Store real: dest = st(0) (mem32/mem64) FSTP dest dest = st(0) (mem32/mem64/mem80) and pop stack FIST dest Store integer: dest = st(0) (mem32/mem64) FISTP dest dest = st(0) (mem16/mem32/mem64) and pop stack FBST dest Store BCD: dest = st(0) (mem80) FBSTP dest dest = st(0) (mem80) and pop stack ÚÄÄÄÄÄÄÄÄÄ¿ ³ Compare ³ ÀÄÄÄÄÄÄÄÄÄÙ FCOM Compare real: Set flags as for st(0) - st(1) FCOM op Set flags for st(0) - op (mem32/mem64) FCOMP op Compare st(0) with op (reg/mem); pop stack FCOMPP Compare st(0) with st(1) and pop stack twice FICOM op Compare integer: Set flags for st(0) - op (mem16/mem32) FICOMP op Compare st(0) with op (mem16/mem32) and pop stack FTST Test for zero: Compare st(0) with 0.0 FUCOM st(i) Unordered Compare: st(0) to st(i) [486] FUCOMP st(i) Compare st(0) with st(i) and pop stack FUCOMPP st(i) Compare st(0) with st(i) and pop stack twice FXAM Examine: st(0) (set condition codes) ÚÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Arithmetic ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÙ FADD Add real: st(0) = st(0) + st(1) FADD src st(0) = st(0) + src (mem32/mem64) FADD st(i),st st(i) = st(i) + st(0) FADDP st(i),st st(i) = st(i) + st(0) and pop stack FIADD src Add integer: st(0) = st(0) + src (mem16/mem32) FSUB Subtract real: st(0) = st(0) - st(1) FSUB src st(0) = st(0) - src (reg/mem) FSUB st(i),st st(i) = st(i) - st(0) FSUBP st(i),st st(i) = st(i) - st(0) and pop stack FSUBR st(i),st Subtract Reversed: st(0) = st(i) - st(0) FSUBRP st(i),st st(0) = st(i) - st(0); pop stack FISUB src Subtract integer: st(0) = st(0) - src (mem16/mem32) FISUBR src Subtract Rvrsd int: st(0) = src - st(0) (mem16/mem32) FMUL Multiply real: st(0) = st(0) * st(1) FMUL st(i) st(0) = st(0) * st(i) FMUL st(i),st st(i) = st(0) * st(i) FMULP st(i),st st(i) = st(0) * st(i) and pop stack FIMUL src Multiply integer: st(0) = st(0) * src (mem16/mem32) FDIV Divide real: st(0) = st(0) ÷ st(1) FDIV st(i) st(0) = st(0) ÷ t(i) FDIV st(i),st st(i) = st(0) ÷ st(i) FDIVP st(i),st st(i) = st(0) ÷ st(i) and pop stack FIDIV src Divide integer: st(0) = st(0) ÷ src (mem16/mem32) FDIVR st(i),st Divide Rvrsd real: st(0) = st(i) ÷ st(0) FDIVRP st(i),st st(0) = st(i) ÷ st(0) and pop stack FIDIVR src Divide Rvrsd int: st(0) = src ÷ st(0) (mem16/mem32) FSQRT Square Root: st(0) = sqrt st(0) FSCALE Scale by power of 2: st(0) = 2 ^ st(0) FXTRACT Extract exponent: st(0) = exponent of st(0) and gets pushed as st(0) = significand of st(0) FPREM Partial remainder: st(0) = st(0) MOD st(1) FPREM1 Partial Remainder (IEEE): same as FPREM, but in IEEE standard [486] FRNDINT Round to nearest int: st(0) = INT( st(0) ), depends on RC flag FABS Get absolute value: st(0) = ABS( st(0) ), removes sign (changes to postive) FCHS Change sign: st(0) = -st(0) ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Transcendental ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ FCOS Cosine: st(0) = COS( st(0) ) FPTAN Partial tangent: st(0) = TAN( st(0) ) FPATAN. Partial Arctangent: st(0) = ATAN( st(0) ) FSIN Sine: st(0) = SIN( st(0) ) FSINCOS Sine and Cosine: st(0) = SIN( st(0) ) and is pushed to st(1) st(0) = COS( st(0) ) F2XM1 Calculate (2 ^ x)-1: st(0) = [2 ^ st(0)] - 1 FYL2X Calculate Y * log2(X): st(0) is Y, st(1) is X; this replaces st(0) and st(1) with: st(0) * log2( st(1) ) FYL2XP1 Calculate Y * log2(X+1): st(0) is Y; st(1) is X; this replaces st(0) and st(1) with: st(0) * log2( st(1)+1 ) ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Processor Control ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ FINIT Initialize FPU FSTSW AX STore Status Word in EAX., ie. EAX = MSW FSTSW dest dest = MSW (mem16) FLDCW src LoaD Control Word: FPU CW = src (mem16) FSTCW dest STore Control Word: dest = FPU CW FCLEX Clear exceptions FSTENV dest STore ENVironment: stores status, control and tag words and exception pointers into memory at dest FLDENV src LoaD ENVironment: loads environment from memory at src FSAVE dest Store FPU state: store FPU state into 94-bytes at dest FRSTOR src Load FPU state: restore FPU state as saved by FSAVE FINCSTP Increment FPU stack ptr: st(6)<-st(5); st(5)<-st(4),...,st(0) FDECSTP Decrement FPU stack ptr: st(0)<-st(1); st(1)<-st(2),...,st(7) The above 2 instuctions put the corresponding values too in the inc/dec stacks. FFREE st(i) Mark reg st(i) as unused FNOP No operation: st(0) = st(0), equivalent to nop. WAIT/FWAIT Synchronize FPU & CPU: Halt CPU until FPU finishes current opcode. FXCH - eXCHange instruction st(0) <- st(1) st(1) <- st(0) similar to xchg. Suppose if you want to calculate something like: tan(cos(sin(a*b+c*d)))/4 What do we do, simple, see below: First things first, you'll below that for loading I have used FILD instead, b'coz FLD expects an integer in IEEE 754 format. What I mean to say is that , ssuppose you wanna load 12345678h in a variable say Var, now when you use FLD Var gets loaded with something like 1.2345e-67 (pretty nasty, huh?).But when you use FILD, Var gets loaded with 12345678h and that's what we want. So lets the algo. below: finit ;thats very very necesarry fild dword ptr [a] ;ST(0) = a, we work in dwords fild dword ptr [b] ;ST(0) = b, ST(1) = a fadd ;ST(0) = a+b fistp dword ptr [Var] ;Var=ST(0), store in a temp. variable fild dword ptr [c] ;ST(0) = c fild dword ptr [d] ;ST() = d, ST(1) = c fadd ;ST(0) = c+d fild dword ptr [Var] ;ST(0) = Var, ST(1) = c+d fmul ;ST(0) = Var*(c+d) fsin ;sin of ST(0) fcos ;cos of ST(0) ftan ;tan of ST(0) mov Var2, 4h ;Var2 = 4 fild dword ptr [Var2] ;ST(0) = Var2 = 4, ST(1) = tan(cos(sin(a*b+c*d)))/4 Pretty simple, and pretty long huh? Something you always got to remember is that, put .387 (or .487, .587). Like below: .386 .387 .data [...] .code [...] End This enables you to use 32 bit registers with FPU registers. If you have any problems, email me at: aks8586@yahoo.com -Surya/powerdryv -23rd June, 2004 I inspire....