
September 10th, 2003, 10:03 AM
#1
How to compute asin?
Hi,
I want to use faster trig functions instead of the libc ones. I wrote a function to compute sin() using the FPU:
float fpu_sin(float radian)
{
    float result;
    _asm {
        finit;        // Initialise the FPU
        fld radian;   // Load the angle in radians
        fsin;         // Calculate the sin of that angle
        fstp result;  // Store the result
    }
    return result;    // Return the result
}
BUT now I want to create a function like libc's asin() to compute the angle. An FPU fasin instruction does not seem to exist.
Any idea how to compute my own asin() using the FPU?
Thanks,
Skynet.

September 10th, 2003, 10:39 AM
#2
Skynet... interesting, maybe we'll find out, in this resulting thread, why there is no fasin Intel instruction!
I have an idea though...
Write a very simple program that calls the library function. Put a breakpoint on that line of code and run it. When you hit the breakpoint, switch to the disassembly view and trace one instruction at a time. You should be able to find out where it actually calculates the arcsin and extract that bit of code.
It would be interesting to know if you have got any timings:
Your fpu_sin(..) VS sin(...)
I only ask because compiler optimisation is pretty good nowadays, and I'd use your version in my code if it wins!
Thanks
Rob.

Ohhhhh.... Old McDonald was dyslexic, E O I O EEEEEEEEEE.......

September 10th, 2003, 12:25 PM
#3
My fpu_sin() function is about 110% faster than the libc sin().
And because I use angles in degrees between 1 and 360, I created another function with a precalculated sin table. The speed is very interesting.
Also, in my fpu_sin(), you don't have to call finit each time.
Best regards,
Skynet.

September 10th, 2003, 12:54 PM
#4
I actually implemented this myself years ago and I can give you a clue.
The FPATAN instruction (partial arctangent) performs the basic calculation required for all inverse trigonometric functions. It is actually relatively simple to implement once you have made sure that the parameters are in the proper range for FPATAN.
This function calculates arctan(Y/X) and if you really know your trig (a good textbook will help) you should be able to derive the other inverse functions.
If I have time later (and you still haven't solved the problem) I'll see if I can find some source code. I think that most of it is on 5.25" floppies in a box somewhere; I did this work years ago.

September 10th, 2003, 01:11 PM
#5
Another clue:
asin(z) = atan( z / sqrt((1-z)*(1+z)) ) = atan(Y/X)
atan(Y/X) is implemented by the FPATAN instruction, where Y=z and X=sqrt((1-z)*(1+z)).
Now, if someone could only solve my problem.

September 10th, 2003, 01:36 PM
#6
hi guys,
I'm an assembly programmer:
here is the code:
Code:
fld st(0)       ;copy sine value
fmul st,st(0)   ;sin^2
fld1
fsubr           ;1-sin^2 = cos^2
fsqrt           ;->cos
fpatan          ;i.e. arctan(sin/cos) = arcsin
I will explain the math.
This code simply calculates the cosine from the sine using the equation: cos^2 X = 1 - sin^2 X
and then gets the cosine by taking the square root,
and then calculates the arctan by dividing the sine value by the cosine value and taking the arctan, all in one instruction.
If you want the full documentation of the coprocessor's FPU instruction set, just post a reply.
you must have a target to work for..............

September 11th, 2003, 02:20 PM
#7
Well,
Thank you all for your answers. They were greatly appreciated.
skynet.

September 11th, 2003, 07:39 PM
#8
Originally posted by rliq
Skynet... interesting, maybe we'll find out, in this resulting thread, why there is no fasin Intel instruction!
As one who has worked with the 8087 from the very beginning, I think I can answer that question. Here is a quote from one of the books on my shelf:
The elementary transcendental functions consist of two trigonometric instructions, two logarithmic instructions, and an exponential instruction. These instructions can be used to compute all the trigonometric, hyperbolic, logarithmic, and exponential functions within a restricted range.
The reason for the restricted range on some instructions is to save microcode space on the 8087 (microcode is a program stored within the 8087 that defines the algorithms used by the processor). It required some special techniques to fit just the restricted functions into the microcode. Thus, the 8087 performs what would otherwise be the most time-consuming portion of the computation and leaves the task of argument reduction to the user’s program.
John F. Palmer & Stephen P. Morse
The 8087 Primer – Wiley Press – 1984
Palmer and Morse were Senior Staff Engineers at Intel Corp. and the principal architects of the 8087 and 8086 respectively.
You may remember that I stated that in implementing the FPATAN function you are required to make sure that the Y & X arguments are in the proper range. This is indicated by the book excerpt above. However, I notice that amrsfmt’s code (which is essentially what I did except that I take the square root of X rather than squaring Y – the ratio is the same) does not do any of the range checking, yet it still works! According to the documentation for the 8087 it shouldn’t, and in fact I am certain that it didn't in the original code that I wrote. I suspect that this is because we are now using Pentium class coprocessors that have greatly improved microcode. In other words, the FPATAN instruction now accepts a full range of arguments even though the 8087's didn't.

September 11th, 2003, 08:02 PM
#9
Just one other thing: I notice that you use the FINIT instruction in your function. If you are really trying to shave machine cycles, I would recommend not using FINIT in your functions. Experience has taught me that if the coprocessor code is written properly, FINIT only needs to be called once in your program.
I don't believe I have ever seen a case where the coprocessor needed to be initialized a second time except when the stack overflowed. And the stack will not overflow if your code is written properly.
Your sine function looks fine to me. It does one load and one pop, so no problem.
As shown, amrsfmt's code has a problem with the stack. I assume that it's just because he extracted it out of context as an example. I would also guess that the stack top is initially loaded with the sine and then the result is popped (which is not shown in his code), and since two loads and no pops are shown, the stack will grow by two each time this function is called. You might be tempted to use FINIT to reset the stack each time, but I would recommend doing a load/pop match instead.

September 12th, 2003, 01:14 AM
#10
hi all,
As shown, amrsfmt's code has a problem with the stack. I assume that it's just because he extracted it out of context as an example. I would also guess that the stack top is initially loaded with the sine and then the result is popped (which is not shown in his code), and since two loads and no pops are shown, the stack will grow by two each time this function is called. You might be tempted to use FINIT to reset the stack each time, but I would recommend doing a load/pop match instead.
sorry 0xC0000005,
you are wrong. I'm actually an assembly programmer, and I have programmed the FPU before. You might be right if you are working on the 8087 processor, but on the 80386 and later that's wrong.
See this documentation:
FPATAN finds the partial arctangent by calculating Z = ARCTAN(Y / X) where X is taken from ST and Y from ST(1). On the 8087/287, Y and X must be in the range 0 <= Y < X < infinity. On the 80387/486, there is no restriction on X and Y. X is popped from the stack and Z replaces Y in ST.
I think everything you want to know is here now. That might help.
So there is no stack overflow. Thanks.
About finit: I think that if you have used any C math functions in your program, the C compiler automatically inserts finit, so there is no need to write it; if not, it's safer to write it.
amrsfmt
you must have a target to work for..............

September 12th, 2003, 06:46 AM
#11
Originally posted by amrsfmt
sorry 0xC0000005,
you are wrong. I'm actually an assembly programmer, and I have programmed the FPU before. You might be right if you are working on the 8087 processor, but on the 80386 and later that's wrong.
My apologies. After consulting the documentation for each function I see that there are two loads (fld, fld1) and two pops (fsubr, fpatan) in your code example. I simply forgot that fsubr and fpatan both popped the stack. It was an inexcusable error.
I did say that there was a problem with your code, but having seen too much code that disregards detail (and uses finit to fix it), the intent was really to advise Skynet to pay attention to detail and code properly. Once again, sorry that it was at your expense.
By the way, I started programming the 8087 in machine language before it was even released to the general public. My company bought prototypes directly from Intel, and lacking assembler or compiler support at that time, I wrote macros using the DB (define byte) type instructions that actually poked the machine opcodes into the correct locations. It's been a long time though, and I shouldn't have answered as I did without having all the facts.
And as I suspected, the fpatan function on newer coprocessors has been given a greatly expanded input range. If I ever do any more of this type of programming, I'll have to throw out all of my twenty year old books and get some new ones.
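
The load/pop bookkeeping discussed in this exchange can be checked with a small tally. This is just a worked example in C, with invented names (fpu_step, asin_steps, net_stack_change); the deltas follow the count given above (fld and fld1 push, fsubr and fpatan pop), and the sine is assumed to be in ST(0) already on entry.

```c
#include <assert.h>
#include <stddef.h>

struct fpu_step {
    const char *mnemonic;
    int stack_delta;   /* +1 for a push, -1 for a pop, 0 otherwise */
};

/* amrsfmt's sequence; the sine is assumed to be in ST(0) on entry. */
const struct fpu_step asin_steps[] = {
    { "fld st(0)",     +1 },
    { "fmul st,st(0)",  0 },
    { "fld1",          +1 },
    { "fsubr",         -1 },   /* no-operand form pops the stack */
    { "fsqrt",          0 },
    { "fpatan",        -1 },
};

/* Sum the pushes and pops over a sequence of instructions. */
int net_stack_change(const struct fpu_step *steps, size_t count)
{
    int depth = 0;
    size_t i;

    for (i = 0; i < count; ++i) {
        depth += steps[i].stack_delta;
    }
    return depth;
}

/* Aborts if the sequence leaves the register stack at a different depth. */
void check_balanced(const struct fpu_step *steps, size_t count)
{
    assert(net_stack_change(steps, count) == 0);
}
```

The net change is zero: the sine that was in ST(0) on entry is replaced by the angle, so the only pop still needed is the final fstp that stores the result.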

September 12th, 2003, 07:50 AM
#12
hi 0xC0000005,
thanks for your nice reply.
Actually, I'm still a student in the faculty of engineering.
One of my friends and I were trying to make a calculator like the Casio fx82.
I would like to have some resources on the 8087.
If you can help, I would be grateful.
thanks,
amrsfmt
you must have a target to work for..............