1*25c28e83SPiotr Jasiukajtis /* 2*25c28e83SPiotr Jasiukajtis * CDDL HEADER START 3*25c28e83SPiotr Jasiukajtis * 4*25c28e83SPiotr Jasiukajtis * The contents of this file are subject to the terms of the 5*25c28e83SPiotr Jasiukajtis * Common Development and Distribution License (the "License"). 6*25c28e83SPiotr Jasiukajtis * You may not use this file except in compliance with the License. 7*25c28e83SPiotr Jasiukajtis * 8*25c28e83SPiotr Jasiukajtis * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE 9*25c28e83SPiotr Jasiukajtis * or http://www.opensolaris.org/os/licensing. 10*25c28e83SPiotr Jasiukajtis * See the License for the specific language governing permissions 11*25c28e83SPiotr Jasiukajtis * and limitations under the License. 12*25c28e83SPiotr Jasiukajtis * 13*25c28e83SPiotr Jasiukajtis * When distributing Covered Code, include this CDDL HEADER in each 14*25c28e83SPiotr Jasiukajtis * file and include the License file at usr/src/OPENSOLARIS.LICENSE. 15*25c28e83SPiotr Jasiukajtis * If applicable, add the following below this CDDL HEADER, with the 16*25c28e83SPiotr Jasiukajtis * fields enclosed by brackets "[]" replaced with your own identifying 17*25c28e83SPiotr Jasiukajtis * information: Portions Copyright [yyyy] [name of copyright owner] 18*25c28e83SPiotr Jasiukajtis * 19*25c28e83SPiotr Jasiukajtis * CDDL HEADER END 20*25c28e83SPiotr Jasiukajtis */ 21*25c28e83SPiotr Jasiukajtis 22*25c28e83SPiotr Jasiukajtis /* 23*25c28e83SPiotr Jasiukajtis * Copyright 2011 Nexenta Systems, Inc. All rights reserved. 24*25c28e83SPiotr Jasiukajtis */ 25*25c28e83SPiotr Jasiukajtis /* 26*25c28e83SPiotr Jasiukajtis * Copyright 2006 Sun Microsystems, Inc. All rights reserved. 27*25c28e83SPiotr Jasiukajtis * Use is subject to license terms. 28*25c28e83SPiotr Jasiukajtis */ 29*25c28e83SPiotr Jasiukajtis 30*25c28e83SPiotr Jasiukajtis #if defined(ELFOBJ) 31*25c28e83SPiotr Jasiukajtis #pragma weak tgamma = __tgamma 32*25c28e83SPiotr Jasiukajtis #endif 33*25c28e83SPiotr Jasiukajtis 34*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 35*25c28e83SPiotr Jasiukajtis /* 36*25c28e83SPiotr Jasiukajtis * True gamma function 37*25c28e83SPiotr Jasiukajtis * double tgamma(double x) 38*25c28e83SPiotr Jasiukajtis * 39*25c28e83SPiotr Jasiukajtis * Error: 40*25c28e83SPiotr Jasiukajtis * ------ 41*25c28e83SPiotr Jasiukajtis * Less that one ulp for both positive and negative arguments. 42*25c28e83SPiotr Jasiukajtis * 43*25c28e83SPiotr Jasiukajtis * Algorithm: 44*25c28e83SPiotr Jasiukajtis * --------- 45*25c28e83SPiotr Jasiukajtis * A: For negative argument 46*25c28e83SPiotr Jasiukajtis * (1) gamma(-n or -inf) is NaN 47*25c28e83SPiotr Jasiukajtis * (2) Underflow Threshold 48*25c28e83SPiotr Jasiukajtis * (3) Reduction to gamma(1+x) 49*25c28e83SPiotr Jasiukajtis * B: For x between 1 and 2 50*25c28e83SPiotr Jasiukajtis * C: For x between 0 and 1 51*25c28e83SPiotr Jasiukajtis * D: For x between 2 and 8 52*25c28e83SPiotr Jasiukajtis * E: Overflow thresold {see over.c} 53*25c28e83SPiotr Jasiukajtis * F: For overflow_threshold >= x >= 8 54*25c28e83SPiotr Jasiukajtis * 55*25c28e83SPiotr Jasiukajtis * Implementation details 56*25c28e83SPiotr Jasiukajtis * ----------------------- 57*25c28e83SPiotr Jasiukajtis * -pi 58*25c28e83SPiotr Jasiukajtis * (A) For negative argument, use gamma(-x) = ------------------------. 59*25c28e83SPiotr Jasiukajtis * (sin(pi*x)*gamma(1+x)) 60*25c28e83SPiotr Jasiukajtis * 61*25c28e83SPiotr Jasiukajtis * (1) gamma(-n or -inf) is NaN with invalid signal by SUSv3 spec. 62*25c28e83SPiotr Jasiukajtis * (Ideally, gamma(-n) = 1/sinpi(n) = (-1)**(n+1) * inf.) 63*25c28e83SPiotr Jasiukajtis * 64*25c28e83SPiotr Jasiukajtis * (2) Underflow Threshold. For each precision, there is a value T 65*25c28e83SPiotr Jasiukajtis * such that when x>T and when x is not an integer, gamma(-x) will 66*25c28e83SPiotr Jasiukajtis * always underflow. A table of the underflow threshold value is given 67*25c28e83SPiotr Jasiukajtis * below. For proof, see file "under.c". 68*25c28e83SPiotr Jasiukajtis * 69*25c28e83SPiotr Jasiukajtis * Precision underflow threshold T = 70*25c28e83SPiotr Jasiukajtis * ---------------------------------------------------------------------- 71*25c28e83SPiotr Jasiukajtis * single 41.000041962 = 41 + 11 ULP 72*25c28e83SPiotr Jasiukajtis * (machine format) 4224000B 73*25c28e83SPiotr Jasiukajtis * double 183.000000000000312639 = 183 + 11 ULP 74*25c28e83SPiotr Jasiukajtis * (machine format) 4066E000 0000000B 75*25c28e83SPiotr Jasiukajtis * quad 1774.0000000000000000000000000000017749370 = 1774 + 9 ULP 76*25c28e83SPiotr Jasiukajtis * (machine format) 4009BB80000000000000000000000009 77*25c28e83SPiotr Jasiukajtis * ---------------------------------------------------------------------- 78*25c28e83SPiotr Jasiukajtis * 79*25c28e83SPiotr Jasiukajtis * (3) Reduction to gamma(1+x). 80*25c28e83SPiotr Jasiukajtis * Because of (1) and (2), we need only consider non-integral x 81*25c28e83SPiotr Jasiukajtis * such that 0<x<T. Let k = [x] and z = x-[x]. Define 82*25c28e83SPiotr Jasiukajtis * sin(x*pi) cos(x*pi) 83*25c28e83SPiotr Jasiukajtis * kpsin(x) = --------- and kpcos(x) = --------- . Then 84*25c28e83SPiotr Jasiukajtis * pi pi 85*25c28e83SPiotr Jasiukajtis * 1 86*25c28e83SPiotr Jasiukajtis * gamma(-x) = --------------------. 87*25c28e83SPiotr Jasiukajtis * -kpsin(x)*gamma(1+x) 88*25c28e83SPiotr Jasiukajtis * Since x = k+z, 89*25c28e83SPiotr Jasiukajtis * k+1 90*25c28e83SPiotr Jasiukajtis * -sin(x*pi) = -sin(k*pi+z*pi) = (-1) *sin(z*pi), 91*25c28e83SPiotr Jasiukajtis * k+1 92*25c28e83SPiotr Jasiukajtis * we have -kpsin(x) = (-1) * kpsin(z). We can further 93*25c28e83SPiotr Jasiukajtis * reduce z to t by 94*25c28e83SPiotr Jasiukajtis * (I) t = z when 0.00000 <= z < 0.31830... 95*25c28e83SPiotr Jasiukajtis * (II) t = 0.5-z when 0.31830... <= z < 0.681690... 96*25c28e83SPiotr Jasiukajtis * (III) t = 1-z when 0.681690... <= z < 1.00000 97*25c28e83SPiotr Jasiukajtis * and correspondingly 98*25c28e83SPiotr Jasiukajtis * (I) kpsin(z) = kpsin(t) ... 0<= z < 0.3184 99*25c28e83SPiotr Jasiukajtis * (II) kpsin(z) = kpcos(t) ... |t| < 0.182 100*25c28e83SPiotr Jasiukajtis * (III) kpsin(z) = kpsin(t) ... 0<= t < 0.3184 101*25c28e83SPiotr Jasiukajtis * 102*25c28e83SPiotr Jasiukajtis * Using a special Remez algorithm, we obtain the following polynomial 103*25c28e83SPiotr Jasiukajtis * approximation for kpsin(t) for 0<=t<0.3184: 104*25c28e83SPiotr Jasiukajtis * 105*25c28e83SPiotr Jasiukajtis * Computation note: in simulating higher precision arithmetic, kcpsin 106*25c28e83SPiotr Jasiukajtis * return head = t and tail = ks[0]*t^3 + (...) to maintain extra bits. 107*25c28e83SPiotr Jasiukajtis * 108*25c28e83SPiotr Jasiukajtis * Quad precision, remez error <= 2**(-129.74) 109*25c28e83SPiotr Jasiukajtis * 3 5 27 110*25c28e83SPiotr Jasiukajtis * kpsin(t) = t + ks[0] * t + ks[1] * t + ... + ks[12] * t 111*25c28e83SPiotr Jasiukajtis * 112*25c28e83SPiotr Jasiukajtis * ks[ 0] = -1.64493406684822643647241516664602518705158902870e+0000 113*25c28e83SPiotr Jasiukajtis * ks[ 1] = 8.11742425283353643637002772405874238094995726160e-0001 114*25c28e83SPiotr Jasiukajtis * ks[ 2] = -1.90751824122084213696472111835337366232282723933e-0001 115*25c28e83SPiotr Jasiukajtis * ks[ 3] = 2.61478478176548005046532613563241288115395517084e-0002 116*25c28e83SPiotr Jasiukajtis * ks[ 4] = -2.34608103545582363750893072647117829448016479971e-0003 117*25c28e83SPiotr Jasiukajtis * ks[ 5] = 1.48428793031071003684606647212534027556262040158e-0004 118*25c28e83SPiotr Jasiukajtis * ks[ 6] = -6.97587366165638046518462722252768122615952898698e-0006 119*25c28e83SPiotr Jasiukajtis * ks[ 7] = 2.53121740413702536928659271747187500934840057929e-0007 120*25c28e83SPiotr Jasiukajtis * ks[ 8] = -7.30471182221385990397683641695766121301933621956e-0009 121*25c28e83SPiotr Jasiukajtis * ks[ 9] = 1.71653847451163495739958249695549313987973589884e-0010 122*25c28e83SPiotr Jasiukajtis * ks[10] = -3.34813314714560776122245796929054813458341420565e-0012 123*25c28e83SPiotr Jasiukajtis * ks[11] = 5.50724992262622033449487808306969135431411753047e-0014 124*25c28e83SPiotr Jasiukajtis * ks[12] = -7.67678132753577998601234393215802221104236979928e-0016 125*25c28e83SPiotr Jasiukajtis * 126*25c28e83SPiotr Jasiukajtis * Double precision, Remez error <= 2**(-62.9) 127*25c28e83SPiotr Jasiukajtis * 3 5 15 128*25c28e83SPiotr Jasiukajtis * kpsin(t) = t + ks[0] * t + ks[1] * t + ... + ks[6] * t 129*25c28e83SPiotr Jasiukajtis * 130*25c28e83SPiotr Jasiukajtis * ks[0] = -1.644934066848226406065691 (0x3ffa51a6 625307d3) 131*25c28e83SPiotr Jasiukajtis * ks[1] = 8.11742425283341655883668741874008920850698590621e-0001 132*25c28e83SPiotr Jasiukajtis * ks[2] = -1.90751824120862873825597279118304943994042258291e-0001 133*25c28e83SPiotr Jasiukajtis * ks[3] = 2.61478477632554278317289628332654539353521911570e-0002 134*25c28e83SPiotr Jasiukajtis * ks[4] = -2.34607978510202710377617190278735525354347705866e-0003 135*25c28e83SPiotr Jasiukajtis * ks[5] = 1.48413292290051695897242899977121846763824221705e-0004 136*25c28e83SPiotr Jasiukajtis * ks[6] = -6.87730769637543488108688726777687262485357072242e-0006 137*25c28e83SPiotr Jasiukajtis * 138*25c28e83SPiotr Jasiukajtis * Single precision, Remez error <= 2**(-34.09) 139*25c28e83SPiotr Jasiukajtis * 3 5 9 140*25c28e83SPiotr Jasiukajtis * kpsin(t) = t + ks[0] * t + ks[1] * t + ... + ks[3] * t 141*25c28e83SPiotr Jasiukajtis * 142*25c28e83SPiotr Jasiukajtis * ks[0] = -1.64493404985645811354476665052005342839447790544e+0000 143*25c28e83SPiotr Jasiukajtis * ks[1] = 8.11740794458351064092797249069438269367389272270e-0001 144*25c28e83SPiotr Jasiukajtis * ks[2] = -1.90703144603551216933075809162889536878854055202e-0001 145*25c28e83SPiotr Jasiukajtis * ks[3] = 2.55742333994264563281155312271481108635575331201e-0002 146*25c28e83SPiotr Jasiukajtis * 147*25c28e83SPiotr Jasiukajtis * Computation note: in simulating higher precision arithmetic, kcpsin 148*25c28e83SPiotr Jasiukajtis * return head = t and tail = kc[0]*t^3 + (...) to maintain extra bits 149*25c28e83SPiotr Jasiukajtis * precision. 150*25c28e83SPiotr Jasiukajtis * 151*25c28e83SPiotr Jasiukajtis * And for kpcos(t) for |t|< 0.183: 152*25c28e83SPiotr Jasiukajtis * 153*25c28e83SPiotr Jasiukajtis * Quad precision, remez <= 2**(-122.48) 154*25c28e83SPiotr Jasiukajtis * 2 4 22 155*25c28e83SPiotr Jasiukajtis * kpcos(t) = 1/pi + pi/2 * t + kc[2] * t + ... + kc[11] * t 156*25c28e83SPiotr Jasiukajtis * 157*25c28e83SPiotr Jasiukajtis * kc[2] = 1.29192819501249250731151312779548918765320728489e+0000 158*25c28e83SPiotr Jasiukajtis * kc[3] = -4.25027339979557573976029596929319207009444090366e-0001 159*25c28e83SPiotr Jasiukajtis * kc[4] = 7.49080661650990096109672954618317623888421628613e-0002 160*25c28e83SPiotr Jasiukajtis * kc[5] = -8.21458866111282287985539464173976555436050215120e-0003 161*25c28e83SPiotr Jasiukajtis * kc[6] = 6.14202578809529228503205255165761204750211603402e-0004 162*25c28e83SPiotr Jasiukajtis * kc[7] = -3.33073432691149607007217330302595267179545908740e-0005 163*25c28e83SPiotr Jasiukajtis * kc[8] = 1.36970959047832085796809745461530865597993680204e-0006 164*25c28e83SPiotr Jasiukajtis * kc[9] = -4.41780774262583514450246512727201806217271097336e-0008 165*25c28e83SPiotr Jasiukajtis * kc[10]= 1.14741409212381858820016567664488123478660705759e-0009 166*25c28e83SPiotr Jasiukajtis * kc[11]= -2.44261236114707374558437500654381006300502749632e-0011 167*25c28e83SPiotr Jasiukajtis * 168*25c28e83SPiotr Jasiukajtis * Double precision, remez < 2**(61.91) 169*25c28e83SPiotr Jasiukajtis * 2 4 12 170*25c28e83SPiotr Jasiukajtis * kpcos(t) = 1/pi + pi/2 *t + kc[2] * t + ... + kc[6] * t 171*25c28e83SPiotr Jasiukajtis * 172*25c28e83SPiotr Jasiukajtis * kc[2] = 1.29192819501230224953283586722575766189551966008e+0000 173*25c28e83SPiotr Jasiukajtis * kc[3] = -4.25027339940149518500158850753393173519732149213e-0001 174*25c28e83SPiotr Jasiukajtis * kc[4] = 7.49080625187015312373925142219429422375556727752e-0002 175*25c28e83SPiotr Jasiukajtis * kc[5] = -8.21442040906099210866977352284054849051348692715e-0003 176*25c28e83SPiotr Jasiukajtis * kc[6] = 6.10411356829515414575566564733632532333904115968e-0004 177*25c28e83SPiotr Jasiukajtis * 178*25c28e83SPiotr Jasiukajtis * Single precision, remez < 2**(-30.13) 179*25c28e83SPiotr Jasiukajtis * 2 6 180*25c28e83SPiotr Jasiukajtis * kpcos(t) = kc[0] + kc[1] * t + ... + kc[3] * t 181*25c28e83SPiotr Jasiukajtis * 182*25c28e83SPiotr Jasiukajtis * kc[0] = 3.18309886183790671537767526745028724068919291480e-0001 183*25c28e83SPiotr Jasiukajtis * kc[1] = -1.57079581447762568199467875065854538626594937791e+0000 184*25c28e83SPiotr Jasiukajtis * kc[2] = 1.29183528092558692844073004029568674027807393862e+0000 185*25c28e83SPiotr Jasiukajtis * kc[3] = -4.20232949771307685981015914425195471602739075537e-0001 186*25c28e83SPiotr Jasiukajtis * 187*25c28e83SPiotr Jasiukajtis * Computation note: in simulating higher precision arithmetic, kcpcos 188*25c28e83SPiotr Jasiukajtis * return head = 1/pi chopped, and tail = pi/2 *t^2 + (tail part of 1/pi 189*25c28e83SPiotr Jasiukajtis * + ...) to maintain extra bits precision. In particular, pi/2 * t^2 190*25c28e83SPiotr Jasiukajtis * is calculated with great care. 191*25c28e83SPiotr Jasiukajtis * 192*25c28e83SPiotr Jasiukajtis * Thus, the computation of gamma(-x), x>0, is: 193*25c28e83SPiotr Jasiukajtis * Let k = int(x), z = x-k. 194*25c28e83SPiotr Jasiukajtis * For z in (I) 195*25c28e83SPiotr Jasiukajtis * k+1 196*25c28e83SPiotr Jasiukajtis * (-1) 197*25c28e83SPiotr Jasiukajtis * gamma(-x) = ------------------- ; 198*25c28e83SPiotr Jasiukajtis * kpsin(z)*gamma(1+x) 199*25c28e83SPiotr Jasiukajtis * 200*25c28e83SPiotr Jasiukajtis * otherwise, for z in (II), 201*25c28e83SPiotr Jasiukajtis * k+1 202*25c28e83SPiotr Jasiukajtis * (-1) 203*25c28e83SPiotr Jasiukajtis * gamma(-x) = ----------------------- ; 204*25c28e83SPiotr Jasiukajtis * kpcos(0.5-z)*gamma(1+x) 205*25c28e83SPiotr Jasiukajtis * 206*25c28e83SPiotr Jasiukajtis * otherwise, for z in (III), 207*25c28e83SPiotr Jasiukajtis * k+1 208*25c28e83SPiotr Jasiukajtis * (-1) 209*25c28e83SPiotr Jasiukajtis * gamma(-x) = --------------------- . 210*25c28e83SPiotr Jasiukajtis * kpsin(1-z)*gamma(1+x) 211*25c28e83SPiotr Jasiukajtis * 212*25c28e83SPiotr Jasiukajtis * Thus, the computation of gamma(-x) reduced to the computation of 213*25c28e83SPiotr Jasiukajtis * gamma(1+x) and kpsin(), kpcos(). 214*25c28e83SPiotr Jasiukajtis * 215*25c28e83SPiotr Jasiukajtis * (B) For x between 1 and 2. We break [1,2] into three parts: 216*25c28e83SPiotr Jasiukajtis * GT1 = [1.0000, 1.2845] 217*25c28e83SPiotr Jasiukajtis * GT2 = [1.2844, 1.6374] 218*25c28e83SPiotr Jasiukajtis * GT3 = [1.6373, 2.0000] 219*25c28e83SPiotr Jasiukajtis * 220*25c28e83SPiotr Jasiukajtis * For x in GTi, i=1,2,3, let 221*25c28e83SPiotr Jasiukajtis * z1 = 1.134861805732790769689793935774652917006 222*25c28e83SPiotr Jasiukajtis * gz1 = gamma(z1) = 0.9382046279096824494097535615803269576988 223*25c28e83SPiotr Jasiukajtis * tz1 = gamma'(z1) = -0.3517214357852935791015625000000000000000 224*25c28e83SPiotr Jasiukajtis * 225*25c28e83SPiotr Jasiukajtis * z2 = 1.461632144968362341262659542325721328468e+0000 226*25c28e83SPiotr Jasiukajtis * gz2 = gamma(z2) = 0.8856031944108887002788159005825887332080 227*25c28e83SPiotr Jasiukajtis * tz2 = gamma'(z2) = 0.00 228*25c28e83SPiotr Jasiukajtis * 229*25c28e83SPiotr Jasiukajtis * z3 = 1.819773101100500601787868704921606996312e+0000 230*25c28e83SPiotr Jasiukajtis * gz3 = gamma(z3) = 0.9367814114636523216188468970808378497426 231*25c28e83SPiotr Jasiukajtis * tz3 = gamma'(z3) = 0.2805306315422058105468750000000000000000 232*25c28e83SPiotr Jasiukajtis * 233*25c28e83SPiotr Jasiukajtis * and 234*25c28e83SPiotr Jasiukajtis * y = x-zi ... for extra precision, write y = y.h + y.l 235*25c28e83SPiotr Jasiukajtis * Then 236*25c28e83SPiotr Jasiukajtis * gamma(x) = gzi + tzi*(y.h+y.l) + y*y*Ri(y), 237*25c28e83SPiotr Jasiukajtis * = gzi.h + (tzi*y.h + ((tzi*y.l+gzi.l) + y*y*Ri(y))) 238*25c28e83SPiotr Jasiukajtis * = gy.h + gy.l 239*25c28e83SPiotr Jasiukajtis * where 240*25c28e83SPiotr Jasiukajtis * (I) For double precision 241*25c28e83SPiotr Jasiukajtis * 242*25c28e83SPiotr Jasiukajtis * Ri(y) = Pi(y)/Qi(y), i=1,2,3; 243*25c28e83SPiotr Jasiukajtis * 244*25c28e83SPiotr Jasiukajtis * P1(y) = p1[0] + p1[1]*y + ... + p1[4]*y^4 245*25c28e83SPiotr Jasiukajtis * Q1(y) = q1[0] + q1[1]*y + ... + q1[5]*y^5 246*25c28e83SPiotr Jasiukajtis * 247*25c28e83SPiotr Jasiukajtis * P2(y) = p2[0] + p2[1]*y + ... + p2[3]*y^3 248*25c28e83SPiotr Jasiukajtis * Q2(y) = q2[0] + q2[1]*y + ... + q2[6]*y^6 249*25c28e83SPiotr Jasiukajtis * 250*25c28e83SPiotr Jasiukajtis * P3(y) = p3[0] + p3[1]*y + ... + p3[4]*y^4 251*25c28e83SPiotr Jasiukajtis * Q3(y) = q3[0] + q3[1]*y + ... + q3[5]*y^5 252*25c28e83SPiotr Jasiukajtis * 253*25c28e83SPiotr Jasiukajtis * Remez precision of Ri(y): 254*25c28e83SPiotr Jasiukajtis * |gamma(x)-(gzi+tzi*y) - y*y*Ri(y)| <= 2**-62.3 ... for i = 1 255*25c28e83SPiotr Jasiukajtis * <= 2**-59.4 ... for i = 2 256*25c28e83SPiotr Jasiukajtis * <= 2**-62.1 ... for i = 3 257*25c28e83SPiotr Jasiukajtis * 258*25c28e83SPiotr Jasiukajtis * (II) For quad precision 259*25c28e83SPiotr Jasiukajtis * 260*25c28e83SPiotr Jasiukajtis * Ri(y) = Pi(y)/Qi(y), i=1,2,3; 261*25c28e83SPiotr Jasiukajtis * 262*25c28e83SPiotr Jasiukajtis * P1(y) = p1[0] + p1[1]*y + ... + p1[9]*y^9 263*25c28e83SPiotr Jasiukajtis * Q1(y) = q1[0] + q1[1]*y + ... + q1[8]*y^8 264*25c28e83SPiotr Jasiukajtis * 265*25c28e83SPiotr Jasiukajtis * P2(y) = p2[0] + p2[1]*y + ... + p2[9]*y^9 266*25c28e83SPiotr Jasiukajtis * Q2(y) = q2[0] + q2[1]*y + ... + q2[9]*y^9 267*25c28e83SPiotr Jasiukajtis * 268*25c28e83SPiotr Jasiukajtis * P3(y) = p3[0] + p3[1]*y + ... + p3[9]*y^9 269*25c28e83SPiotr Jasiukajtis * Q3(y) = q3[0] + q3[1]*y + ... + q3[9]*y^9 270*25c28e83SPiotr Jasiukajtis * 271*25c28e83SPiotr Jasiukajtis * Remez precision of Ri(y): 272*25c28e83SPiotr Jasiukajtis * |gamma(x)-(gzi+tzi*y) - y*y*Ri(y)| <= 2**-118.2 ... for i = 1 273*25c28e83SPiotr Jasiukajtis * <= 2**-126.8 ... for i = 2 274*25c28e83SPiotr Jasiukajtis * <= 2**-119.5 ... for i = 3 275*25c28e83SPiotr Jasiukajtis * 276*25c28e83SPiotr Jasiukajtis * (III) For single precision 277*25c28e83SPiotr Jasiukajtis * 278*25c28e83SPiotr Jasiukajtis * Ri(y) = Pi(y), i=1,2,3; 279*25c28e83SPiotr Jasiukajtis * 280*25c28e83SPiotr Jasiukajtis * P1(y) = p1[0] + p1[1]*y + ... + p1[5]*y^5 281*25c28e83SPiotr Jasiukajtis * 282*25c28e83SPiotr Jasiukajtis * P2(y) = p2[0] + p2[1]*y + ... + p2[5]*y^5 283*25c28e83SPiotr Jasiukajtis * 284*25c28e83SPiotr Jasiukajtis * P3(y) = p3[0] + p3[1]*y + ... + p3[4]*y^4 285*25c28e83SPiotr Jasiukajtis * 286*25c28e83SPiotr Jasiukajtis * Remez precision of Ri(y): 287*25c28e83SPiotr Jasiukajtis * |gamma(x)-(gzi+tzi*y) - y*y*Ri(y)| <= 2**-30.8 ... for i = 1 288*25c28e83SPiotr Jasiukajtis * <= 2**-31.6 ... for i = 2 289*25c28e83SPiotr Jasiukajtis * <= 2**-29.5 ... for i = 3 290*25c28e83SPiotr Jasiukajtis * 291*25c28e83SPiotr Jasiukajtis * Notes. (1) GTi and zi are choosen to balance the interval width and 292*25c28e83SPiotr Jasiukajtis * minimize the distant between gamma(x) and the tangent line at 293*25c28e83SPiotr Jasiukajtis * zi. In particular, we have 294*25c28e83SPiotr Jasiukajtis * |gamma(x)-(gzi+tzi*(x-zi))| <= 0.01436... for x in [1,z2] 295*25c28e83SPiotr Jasiukajtis * <= 0.01265... for x in [z2,2] 296*25c28e83SPiotr Jasiukajtis * 297*25c28e83SPiotr Jasiukajtis * (2) zi are slightly adjusted so that tzi=gamma'(zi) is very 298*25c28e83SPiotr Jasiukajtis * close to a single precision value. 299*25c28e83SPiotr Jasiukajtis * 300*25c28e83SPiotr Jasiukajtis * Coefficents: Single precision 301*25c28e83SPiotr Jasiukajtis * i= 1: 302*25c28e83SPiotr Jasiukajtis * P1[0] = 7.09087253435088360271451613398019280077561279443e-0001 303*25c28e83SPiotr Jasiukajtis * P1[1] = -5.17229560788652108545141978238701790105241761089e-0001 304*25c28e83SPiotr Jasiukajtis * P1[2] = 5.23403394528150789405825222323770647162337764327e-0001 305*25c28e83SPiotr Jasiukajtis * P1[3] = -4.54586308717075010784041566069480411732634814899e-0001 306*25c28e83SPiotr Jasiukajtis * P1[4] = 4.20596490915239085459964590559256913498190955233e-0001 307*25c28e83SPiotr Jasiukajtis * P1[5] = -3.57307589712377520978332185838241458642142185789e-0001 308*25c28e83SPiotr Jasiukajtis * 309*25c28e83SPiotr Jasiukajtis * i = 2: 310*25c28e83SPiotr Jasiukajtis * p2[0] = 4.28486983980295198166056119223984284434264344578e-0001 311*25c28e83SPiotr Jasiukajtis * p2[1] = -1.30704539487709138528680121627899735386650103914e-0001 312*25c28e83SPiotr Jasiukajtis * p2[2] = 1.60856285038051955072861219352655851542955430871e-0001 313*25c28e83SPiotr Jasiukajtis * p2[3] = -9.22285161346010583774458802067371182158937943507e-0002 314*25c28e83SPiotr Jasiukajtis * p2[4] = 7.19240511767225260740890292605070595560626179357e-0002 315*25c28e83SPiotr Jasiukajtis * p2[5] = -4.88158265593355093703112238534484636193260459574e-0002 316*25c28e83SPiotr Jasiukajtis * 317*25c28e83SPiotr Jasiukajtis * i = 3 318*25c28e83SPiotr Jasiukajtis * p3[0] = 3.82409531118807759081121479786092134814808872880e-0001 319*25c28e83SPiotr Jasiukajtis * p3[1] = 2.65309888180188647956400403013495759365167853426e-0002 320*25c28e83SPiotr Jasiukajtis * p3[2] = 8.06815109775079171923561169415370309376296739835e-0002 321*25c28e83SPiotr Jasiukajtis * p3[3] = -1.54821591666137613928840890835174351674007764799e-0002 322*25c28e83SPiotr Jasiukajtis * p3[4] = 1.76308239242717268530498313416899188157165183405e-0002 323*25c28e83SPiotr Jasiukajtis * 324*25c28e83SPiotr Jasiukajtis * Coefficents: Double precision 325*25c28e83SPiotr Jasiukajtis * i = 1: 326*25c28e83SPiotr Jasiukajtis * p1[0] = 0.70908683619977797008004927192814648151397705078125000 327*25c28e83SPiotr Jasiukajtis * p1[1] = 1.71987061393048558089579513384356441668351720061e-0001 328*25c28e83SPiotr Jasiukajtis * p1[2] = -3.19273345791990970293320316122813960527705450671e-0002 329*25c28e83SPiotr Jasiukajtis * p1[3] = 8.36172645419110036267169600390549973563534476989e-0003 330*25c28e83SPiotr Jasiukajtis * p1[4] = 1.13745336648572838333152213474277971244629758101e-0003 331*25c28e83SPiotr Jasiukajtis * q1[0] = 1.0 332*25c28e83SPiotr Jasiukajtis * q1[1] = 9.71980217826032937526460731778472389791321968082e-0001 333*25c28e83SPiotr Jasiukajtis * q1[2] = -7.43576743326756176594084137256042653497087666030e-0002 334*25c28e83SPiotr Jasiukajtis * q1[3] = -1.19345944932265559769719470515102012246995255372e-0001 335*25c28e83SPiotr Jasiukajtis * q1[4] = 1.59913445751425002620935120470781382215050284762e-0002 336*25c28e83SPiotr Jasiukajtis * q1[5] = 1.12601136853374984566572691306402321911547550783e-0003 337*25c28e83SPiotr Jasiukajtis * i = 2: 338*25c28e83SPiotr Jasiukajtis * p2[0] = 0.42848681585558601181418225678498856723308563232421875 339*25c28e83SPiotr Jasiukajtis * p2[1] = 6.53596762668970816023718845105667418483122103629e-0002 340*25c28e83SPiotr Jasiukajtis * p2[2] = -6.97280829631212931321050770925128264272768936731e-0003 341*25c28e83SPiotr Jasiukajtis * p2[3] = 6.46342359021981718947208605674813260166116632899e-0003 342*25c28e83SPiotr Jasiukajtis * q2[0] = 1.0 343*25c28e83SPiotr Jasiukajtis * q2[1] = 4.57572620560506047062553957454062012327519313936e-0001 344*25c28e83SPiotr Jasiukajtis * q2[2] = -2.52182594886075452859655003407796103083422572036e-0001 345*25c28e83SPiotr Jasiukajtis * q2[3] = -1.82970945407778594681348166040103197178711552827e-0002 346*25c28e83SPiotr Jasiukajtis * q2[4] = 2.43574726993169566475227642128830141304953840502e-0002 347*25c28e83SPiotr Jasiukajtis * q2[5] = -5.20390406466942525358645957564897411258667085501e-0003 348*25c28e83SPiotr Jasiukajtis * q2[6] = 4.79520251383279837635552431988023256031951133885e-0004 349*25c28e83SPiotr Jasiukajtis * i = 3: 350*25c28e83SPiotr Jasiukajtis * p3[0] = 0.382409479734567459008331979930517263710498809814453125 351*25c28e83SPiotr Jasiukajtis * p3[1] = 1.42876048697668161599069814043449301572928034140e-0001 352*25c28e83SPiotr Jasiukajtis * p3[2] = 3.42157571052250536817923866013561760785748899071e-0003 353*25c28e83SPiotr Jasiukajtis * p3[3] = -5.01542621710067521405087887856991700987709272937e-0004 354*25c28e83SPiotr Jasiukajtis * p3[4] = 8.89285814866740910123834688163838287618332122670e-0004 355*25c28e83SPiotr Jasiukajtis * q3[0] = 1.0 356*25c28e83SPiotr Jasiukajtis * q3[1] = 3.04253086629444201002215640948957897906299633168e-0001 357*25c28e83SPiotr Jasiukajtis * q3[2] = -2.23162407379999477282555672834881213873185520006e-0001 358*25c28e83SPiotr Jasiukajtis * q3[3] = -1.05060867741952065921809811933670131427552903636e-0002 359*25c28e83SPiotr Jasiukajtis * q3[4] = 1.70511763916186982473301861980856352005926669320e-0002 360*25c28e83SPiotr Jasiukajtis * q3[5] = -2.12950201683609187927899416700094630764182477464e-0003 361*25c28e83SPiotr Jasiukajtis * 362*25c28e83SPiotr Jasiukajtis * Note that all pi0 are exact in double, which is obtained by a 363*25c28e83SPiotr Jasiukajtis * special Remez Algorithm. 364*25c28e83SPiotr Jasiukajtis * 365*25c28e83SPiotr Jasiukajtis * Coefficents: Quad precision 366*25c28e83SPiotr Jasiukajtis * i = 1: 367*25c28e83SPiotr Jasiukajtis * p1[0] = 0.709086836199777919037185741507610124611513720557 368*25c28e83SPiotr Jasiukajtis * p1[1] = 4.45754781206489035827915969367354835667391606951e-0001 369*25c28e83SPiotr Jasiukajtis * p1[2] = 3.21049298735832382311662273882632210062918153852e-0002 370*25c28e83SPiotr Jasiukajtis * p1[3] = -5.71296796342106617651765245858289197369688864350e-0003 371*25c28e83SPiotr Jasiukajtis * p1[4] = 6.04666892891998977081619174969855831606965352773e-0003 372*25c28e83SPiotr Jasiukajtis * p1[5] = 8.99106186996888711939627812174765258822658645168e-0004 373*25c28e83SPiotr Jasiukajtis * p1[6] = -6.96496846144407741431207008527018441810175568949e-0005 374*25c28e83SPiotr Jasiukajtis * p1[7] = 1.52597046118984020814225409300131445070213882429e-0005 375*25c28e83SPiotr Jasiukajtis * p1[8] = 5.68521076168495673844711465407432189190681541547e-0007 376*25c28e83SPiotr Jasiukajtis * p1[9] = 3.30749673519634895220582062520286565610418952979e-0008 377*25c28e83SPiotr Jasiukajtis * q1[0] = 1.0+0000 378*25c28e83SPiotr Jasiukajtis * q1[1] = 1.35806511721671070408570853537257079579490650668e+0000 379*25c28e83SPiotr Jasiukajtis * q1[2] = 2.97567810153429553405327140096063086994072952961e-0001 380*25c28e83SPiotr Jasiukajtis * q1[3] = -1.52956835982588571502954372821681851681118097870e-0001 381*25c28e83SPiotr Jasiukajtis * q1[4] = -2.88248519561420109768781615289082053597954521218e-0002 382*25c28e83SPiotr Jasiukajtis * q1[5] = 1.03475311719937405219789948456313936302378395955e-0002 383*25c28e83SPiotr Jasiukajtis * q1[6] = 4.12310203243891222368965360124391297374822742313e-0004 384*25c28e83SPiotr Jasiukajtis * q1[7] = -3.12653708152290867248931925120380729518332507388e-0004 385*25c28e83SPiotr Jasiukajtis * q1[8] = 2.36672170850409745237358105667757760527014332458e-0005 386*25c28e83SPiotr Jasiukajtis * 387*25c28e83SPiotr Jasiukajtis * i = 2: 388*25c28e83SPiotr Jasiukajtis * p2[0] = 0.428486815855585429730209907810650616737756697477 389*25c28e83SPiotr Jasiukajtis * p2[1] = 2.63622124067885222919192651151581541943362617352e-0001 390*25c28e83SPiotr Jasiukajtis * p2[2] = 3.85520683670028865731877276741390421744971446855e-0002 391*25c28e83SPiotr Jasiukajtis * p2[3] = 3.05065978278128549958897133190295325258023525862e-0003 392*25c28e83SPiotr Jasiukajtis * p2[4] = 2.48232934951723128892080415054084339152450445081e-0003 393*25c28e83SPiotr Jasiukajtis * p2[5] = 3.67092777065632360693313762221411547741550105407e-0004 394*25c28e83SPiotr Jasiukajtis * p2[6] = 3.81228045616085789674530902563145250532194518946e-0006 395*25c28e83SPiotr Jasiukajtis * p2[7] = 4.61677225867087554059531455133839175822537617677e-0006 396*25c28e83SPiotr Jasiukajtis * p2[8] = 2.18209052385703200438239200991201916609364872993e-0007 397*25c28e83SPiotr Jasiukajtis * p2[9] = 1.00490538985245846460006244065624754421022542454e-0008 398*25c28e83SPiotr Jasiukajtis * q2[0] = 1.0 399*25c28e83SPiotr Jasiukajtis * q2[1] = 9.20276350207639290567783725273128544224570775056e-0001 400*25c28e83SPiotr Jasiukajtis * q2[2] = -4.79533683654165107448020515733883781138947771495e-0003 401*25c28e83SPiotr Jasiukajtis * q2[3] = -1.24538337585899300494444600248687901947684291683e-0001 402*25c28e83SPiotr Jasiukajtis * q2[4] = 4.49866050763472358547524708431719114204535491412e-0003 403*25c28e83SPiotr Jasiukajtis * q2[5] = 7.20715455697920560621638325356292640604078591907e-0003 404*25c28e83SPiotr Jasiukajtis * q2[6] = -8.68513169029126780280798337091982780598228096116e-0004 405*25c28e83SPiotr Jasiukajtis * q2[7] = -1.25104431629401181525027098222745544809974229874e-0004 406*25c28e83SPiotr Jasiukajtis * q2[8] = 3.10558344839000038489191304550998047521253437464e-0005 407*25c28e83SPiotr Jasiukajtis * q2[9] = -1.76829227852852176018537139573609433652506765712e-0006 408*25c28e83SPiotr Jasiukajtis * 409*25c28e83SPiotr Jasiukajtis * i = 3 410*25c28e83SPiotr Jasiukajtis * p3[0] = 0.3824094797345675048502747661075355640070439388902 411*25c28e83SPiotr Jasiukajtis * p3[1] = 3.42198093076618495415854906335908427159833377774e-0001 412*25c28e83SPiotr Jasiukajtis * p3[2] = 9.63828189500585568303961406863153237440702754858e-0002 413*25c28e83SPiotr Jasiukajtis * p3[3] = 8.76069421042696384852462044188520252156846768667e-0003 414*25c28e83SPiotr Jasiukajtis * p3[4] = 1.86477890389161491224872014149309015261897537488e-0003 415*25c28e83SPiotr Jasiukajtis * p3[5] = 8.16871354540309895879974742853701311541286944191e-0004 416*25c28e83SPiotr Jasiukajtis * p3[6] = 6.83783483674600322518695090864659381650125625216e-0005 417*25c28e83SPiotr Jasiukajtis * p3[7] = -1.10168269719261574708565935172719209272190828456e-0006 418*25c28e83SPiotr Jasiukajtis * p3[8] = 9.66243228508380420159234853278906717065629721016e-0007 419*25c28e83SPiotr Jasiukajtis * p3[9] = 2.31858885579177250541163820671121664974334728142e-0008 420*25c28e83SPiotr Jasiukajtis * q3[0] = 1.0 421*25c28e83SPiotr Jasiukajtis * q3[1] = 8.25479821168813634632437430090376252512793067339e-0001 422*25c28e83SPiotr Jasiukajtis * q3[2] = -1.62251363073937769739639623669295110346015576320e-0002 423*25c28e83SPiotr Jasiukajtis * q3[3] = -1.10621286905916732758745130629426559691187579852e-0001 424*25c28e83SPiotr Jasiukajtis * q3[4] = 3.48309693970985612644446415789230015515365291459e-0003 425*25c28e83SPiotr Jasiukajtis * q3[5] = 6.73553737487488333032431261131289672347043401328e-0003 426*25c28e83SPiotr Jasiukajtis * q3[6] = -7.63222008393372630162743587811004613050245128051e-0004 427*25c28e83SPiotr Jasiukajtis * q3[7] = -1.35792670669190631476784768961953711773073251336e-0004 428*25c28e83SPiotr Jasiukajtis * q3[8] = 3.19610150954223587006220730065608156460205690618e-0005 429*25c28e83SPiotr Jasiukajtis * q3[9] = -1.82096553862822346610109522015129585693354348322e-0006 430*25c28e83SPiotr Jasiukajtis * 431*25c28e83SPiotr Jasiukajtis * (C) For x between 0 and 1. 432*25c28e83SPiotr Jasiukajtis * Let P stand for the number of significant bits in the working precision. 433*25c28e83SPiotr Jasiukajtis * -P 1 434*25c28e83SPiotr Jasiukajtis * (1)For 0 <= x <= 2 , gamma(x) is computed by --- rounded to nearest. 435*25c28e83SPiotr Jasiukajtis * x 436*25c28e83SPiotr Jasiukajtis * The error is bound by 0.739 ulp(gamma(x)) in IEEE double precision. 437*25c28e83SPiotr Jasiukajtis * Proof. 438*25c28e83SPiotr Jasiukajtis * 1 2 439*25c28e83SPiotr Jasiukajtis * Since -------- ~ x + 0.577...*x - ..., we have, for small x, 440*25c28e83SPiotr Jasiukajtis * gamma(x) 441*25c28e83SPiotr Jasiukajtis * 1 1 442*25c28e83SPiotr Jasiukajtis * ----------- < gamma(x) < --- and 443*25c28e83SPiotr Jasiukajtis * x(1+0.578x) x 444*25c28e83SPiotr Jasiukajtis * 1 1 1 445*25c28e83SPiotr Jasiukajtis * 0 < --- - gamma(x) <= --- - ----------- < 0.578 446*25c28e83SPiotr Jasiukajtis * x x x(1+0.578x) 447*25c28e83SPiotr Jasiukajtis * 1 1 -P 448*25c28e83SPiotr Jasiukajtis * The error is thus bounded by --- ulp(---) + 0.578. Since x <= 2 , 449*25c28e83SPiotr Jasiukajtis * 2 x 450*25c28e83SPiotr Jasiukajtis * 1 P 1 P 1 451*25c28e83SPiotr Jasiukajtis * --- >= 2 , ulp(---) >= ulp(2 ) >= 2. Thus 0.578=0.289*2<=0.289ulp(-) 452*25c28e83SPiotr Jasiukajtis * x x x 453*25c28e83SPiotr Jasiukajtis * Thus 454*25c28e83SPiotr Jasiukajtis * 1 1 455*25c28e83SPiotr Jasiukajtis * | gamma(x) - [---] rounded | <= (0.5+0.289)*ulp(---). 456*25c28e83SPiotr Jasiukajtis * x x 457*25c28e83SPiotr Jasiukajtis * -P 1 458*25c28e83SPiotr Jasiukajtis * Note that for x<= 2 , it is easy to see that ulp(---)=ulp(gamma(x)) 459*25c28e83SPiotr Jasiukajtis * x 460*25c28e83SPiotr Jasiukajtis * n 1 461*25c28e83SPiotr Jasiukajtis * except only when x = 2 , (n<= -53). In such cases, --- is exact 462*25c28e83SPiotr Jasiukajtis * x 463*25c28e83SPiotr Jasiukajtis * and therefore the error is bounded by 464*25c28e83SPiotr Jasiukajtis * 1 465*25c28e83SPiotr Jasiukajtis * 0.298*ulp(---) = 0.298*2*ulp(gamma(x)) = 0.578ulp(gamma(x)). 466*25c28e83SPiotr Jasiukajtis * x 467*25c28e83SPiotr Jasiukajtis * Thus we conclude that the error in gamma is less than 0.739 ulp. 468*25c28e83SPiotr Jasiukajtis * 469*25c28e83SPiotr Jasiukajtis * (2)Otherwise, for x in GTi-1 (see B), let y = x-(zi-1). From (B) we obtain 470*25c28e83SPiotr Jasiukajtis * gamma(1+x) 471*25c28e83SPiotr Jasiukajtis * gamma(1+x) = gy.h + gy.l, then compute gamma(x) by -----------. 472*25c28e83SPiotr Jasiukajtis * x 473*25c28e83SPiotr Jasiukajtis * gy.h 474*25c28e83SPiotr Jasiukajtis * Implementaion note. Write x = x.h+x.l, and Let th = ----- chopped to 475*25c28e83SPiotr Jasiukajtis * x 476*25c28e83SPiotr Jasiukajtis * 20 bits, then 477*25c28e83SPiotr Jasiukajtis * gy.h+gy.l 478*25c28e83SPiotr Jasiukajtis * gamma(x) = th + (---------- - th ) 479*25c28e83SPiotr Jasiukajtis * x 480*25c28e83SPiotr Jasiukajtis * 1 481*25c28e83SPiotr Jasiukajtis * = th + ---*(gy.h-th*x.h+gy.l-th*x.l) 482*25c28e83SPiotr Jasiukajtis * x 483*25c28e83SPiotr Jasiukajtis * 484*25c28e83SPiotr Jasiukajtis * (D) For x between 2 and 8. Let n = 1+x chopped to an integer. Then 485*25c28e83SPiotr Jasiukajtis * 486*25c28e83SPiotr Jasiukajtis * gamma(x)=(x-1)*(x-2)*...*(x-n)*gamma(x-n) 487*25c28e83SPiotr Jasiukajtis * 488*25c28e83SPiotr Jasiukajtis * Since x-n is between 1 and 2, we can apply (B) to compute gamma(x). 489*25c28e83SPiotr Jasiukajtis * 490*25c28e83SPiotr Jasiukajtis * Implementation detail. The computation of (x-1)(x-2)...(x-n) in simulated 491*25c28e83SPiotr Jasiukajtis * higher precision arithmetic can be somewhat optimized. For example, in 492*25c28e83SPiotr Jasiukajtis * computing (x-1)*(x-2)*(x-3)*(x-4), if we compute (x-1)*(x-4) = z.h+z.l, 493*25c28e83SPiotr Jasiukajtis * then (x-2)(x-3) = z.h+2+z.l readily. In below, we list the expression 494*25c28e83SPiotr Jasiukajtis * of the formula to compute gamma(x). 495*25c28e83SPiotr Jasiukajtis * 496*25c28e83SPiotr Jasiukajtis * Assume x-n is in GTi (i=1,2, or 3, see B for detail). Let y = x - n - zi. 497*25c28e83SPiotr Jasiukajtis * By (B) we have gamma(x-n) = gy.h+gy.l. If x = x.h+x.l, then we have 498*25c28e83SPiotr Jasiukajtis * n=1 (x in [2,3]): 499*25c28e83SPiotr Jasiukajtis * gamma(x) = (x-1)*gamma(x-1) = (x-1)*(gy.h+gy.l) 500*25c28e83SPiotr Jasiukajtis * = [(x.h-1)+x.l]*(gy.h+gy.l) 501*25c28e83SPiotr Jasiukajtis * n=2 (x in [3,4]): 502*25c28e83SPiotr Jasiukajtis * gamma(x) = (x-1)(x-2)*gamma(x-2) = (x-1)*(x-2)*(gy.h+gy.l) 503*25c28e83SPiotr Jasiukajtis * = ((x.h-2)+x.l)*((x.h-1)+x.l)*(gy.h+gy.l) 504*25c28e83SPiotr Jasiukajtis * = [x.h*(x.h-3)+2+x.l*(x+(x.h-3))]*(gy.h+gy.l) 505*25c28e83SPiotr Jasiukajtis * n=3 (x in [4,5]) 506*25c28e83SPiotr Jasiukajtis * gamma(x) = (x-1)(x-2)(x-3)*(gy.h+gy.l) 507*25c28e83SPiotr Jasiukajtis * = (x.h*(x.h-3)+2+x.l*(x+(x.h-3)))*[((x.h-3)+x.l)(gy.h+gy.l)] 508*25c28e83SPiotr Jasiukajtis * n=4 (x in [5,6]) 509*25c28e83SPiotr Jasiukajtis * gamma(x) = [(x-1)(x-4)]*[(x-2)(x-3)]*(gy.h+gy.l) 510*25c28e83SPiotr Jasiukajtis * = [(x.h*(x.h-5)+4+x.l(x+(x.h-5)))]*[(x-2)*(x-3)]*(gy.h+gy.l) 511*25c28e83SPiotr Jasiukajtis * = (y.h+y.l)*(y.h+1+y.l)*(gy.h+gy.l) 512*25c28e83SPiotr Jasiukajtis * n=5 (x in [6,7]) 513*25c28e83SPiotr Jasiukajtis * gamma(x) = [(x-1)(x-4)]*[(x-2)(x-3)]*[(x-5)*(gy.h+gy.l)] 514*25c28e83SPiotr Jasiukajtis * n=6 (x in [7,8]) 515*25c28e83SPiotr Jasiukajtis * gamma(x) = [(x-1)(x-6)]*[(x-2)(x-5)]*[(x-3)(x-4)]*(gy.h+gy.l)] 516*25c28e83SPiotr Jasiukajtis * = [(y.h+y.l)(y.h+4+y.l)][(y.h+6+y.l)(gy.h+gy.l)] 517*25c28e83SPiotr Jasiukajtis * 518*25c28e83SPiotr Jasiukajtis * (E)Overflow Thresold. For x > Overflow thresold of gamma, 519*25c28e83SPiotr Jasiukajtis * return huge*huge (overflow). 520*25c28e83SPiotr Jasiukajtis * 521*25c28e83SPiotr Jasiukajtis * By checking whether lgamma(x) >= 2**{128,1024,16384}, one can 522*25c28e83SPiotr Jasiukajtis * determine the overflow threshold for x in single, double, and 523*25c28e83SPiotr Jasiukajtis * quad precision. See over.c for details. 524*25c28e83SPiotr Jasiukajtis * 525*25c28e83SPiotr Jasiukajtis * The overflow threshold of gamma(x) are 526*25c28e83SPiotr Jasiukajtis * 527*25c28e83SPiotr Jasiukajtis * single: x = 3.5040096283e+01 528*25c28e83SPiotr Jasiukajtis * = 0x420C290F (IEEE single) 529*25c28e83SPiotr Jasiukajtis * double: x = 1.71624376956302711505e+02 530*25c28e83SPiotr Jasiukajtis * = 0x406573FAE561F647 (IEEE double) 531*25c28e83SPiotr Jasiukajtis * quad: x = 1.7555483429044629170038892160702032034177e+03 532*25c28e83SPiotr Jasiukajtis * = 0x4009B6E3180CD66A5C4206F128BA77F4 (quad) 533*25c28e83SPiotr Jasiukajtis * 534*25c28e83SPiotr Jasiukajtis * (F)For overflow_threshold >= x >= 8, we use asymptotic approximation. 535*25c28e83SPiotr Jasiukajtis * (1) Stirling's formula 536*25c28e83SPiotr Jasiukajtis * 537*25c28e83SPiotr Jasiukajtis * log(G(x)) ~= (x-.5)*(log(x)-1) + .5(log(2*pi)-1) + (1/x)*P(1/(x*x)) 538*25c28e83SPiotr Jasiukajtis * = L1 + L2 + L3, 539*25c28e83SPiotr Jasiukajtis * where 540*25c28e83SPiotr Jasiukajtis * L1(x) = (x-.5)*(log(x)-1), 541*25c28e83SPiotr Jasiukajtis * L2 = .5(log(2pi)-1) = 0.41893853...., 542*25c28e83SPiotr Jasiukajtis * L3(x) = (1/x)P(1/(x*x)), 543*25c28e83SPiotr Jasiukajtis * 544*25c28e83SPiotr Jasiukajtis * The range of L1,L2, and L3 are as follows: 545*25c28e83SPiotr Jasiukajtis * 546*25c28e83SPiotr Jasiukajtis * ------------------------------------------------------------------ 547*25c28e83SPiotr Jasiukajtis * Range(L1) = (single) [8.09..,88.30..] =[2** 3.01..,2** 6.46..] 548*25c28e83SPiotr Jasiukajtis * (double) [8.09..,709.3..] =[2** 3.01..,2** 9.47..] 549*25c28e83SPiotr Jasiukajtis * (quad) [8.09..,11356.10..]=[2** 3.01..,2** 13.47..] 550*25c28e83SPiotr Jasiukajtis * Range(L2) = 0.41893853..... 551*25c28e83SPiotr Jasiukajtis * Range(L3) = [0.0104...., 0.00048....] =[2**-6.58..,2**-11.02..] 552*25c28e83SPiotr Jasiukajtis * ------------------------------------------------------------------ 553*25c28e83SPiotr Jasiukajtis * 554*25c28e83SPiotr Jasiukajtis * Gamma(x) is then computed by exp(L1+L2+L3). 555*25c28e83SPiotr Jasiukajtis * 556*25c28e83SPiotr Jasiukajtis * (2) Error analysis of (F): 557*25c28e83SPiotr Jasiukajtis * -------------------------- 558*25c28e83SPiotr Jasiukajtis * The error in Gamma(x) depends on the error inherited in the computation 559*25c28e83SPiotr Jasiukajtis * of L= L1+L2+L3. Let L' be the computed value of L. The absolute error 560*25c28e83SPiotr Jasiukajtis * in L' is t = L-L'. Since exp(L') = exp(L-t) = exp(L)*exp(t) ~ 561*25c28e83SPiotr Jasiukajtis * (1+t)*exp(L), the relative error in exp(L') is approximately t. 562*25c28e83SPiotr Jasiukajtis * 563*25c28e83SPiotr Jasiukajtis * To guarantee the relatively accuracy in exp(L'), we would like 564*25c28e83SPiotr Jasiukajtis * |t| < 2**(-P-5) where P denotes for the number of significant bits 565*25c28e83SPiotr Jasiukajtis * of the working precision. Consequently, each of the L1,L2, and L3 566*25c28e83SPiotr Jasiukajtis * must be computed with absolute error bounded by 2**(-P-5) in absolute 567*25c28e83SPiotr Jasiukajtis * value. 568*25c28e83SPiotr Jasiukajtis * 569*25c28e83SPiotr Jasiukajtis * Since L2 is a constant, it can be pre-computed to the desired accuracy. 570*25c28e83SPiotr Jasiukajtis * Also |L3| < 2**-6; therefore, it suffices to compute L3 with the 571*25c28e83SPiotr Jasiukajtis * working precision. That is, 572*25c28e83SPiotr Jasiukajtis * L3(x) approxmiate log(G(x))-(x-.5)(log(x)-1)-.5(log(2pi)-1) 573*25c28e83SPiotr Jasiukajtis * to a precision bounded by 2**(-P-5). 574*25c28e83SPiotr Jasiukajtis * 575*25c28e83SPiotr Jasiukajtis * 2**(-6) 576*25c28e83SPiotr Jasiukajtis * _________V___________________ 577*25c28e83SPiotr Jasiukajtis * L1(x): |_________|___________________| 578*25c28e83SPiotr Jasiukajtis * __ ________________________ 579*25c28e83SPiotr Jasiukajtis * L2: |__|________________________| 580*25c28e83SPiotr Jasiukajtis * __________________________ 581*25c28e83SPiotr Jasiukajtis * + L3(x): |__________________________| 582*25c28e83SPiotr Jasiukajtis * ------------------------------------------- 583*25c28e83SPiotr Jasiukajtis * [leading] + [Trailing] 584*25c28e83SPiotr Jasiukajtis * 585*25c28e83SPiotr Jasiukajtis * For L1(x)=(x-0.5)*(log(x)-1), we need ilogb(L1(x))+5 extra bits for 586*25c28e83SPiotr Jasiukajtis * both multiplicants to guarantee L1(x)'s absolute error is bounded by 587*25c28e83SPiotr Jasiukajtis * 2**(-P-5) in absolute value. Here ilogb(y) is defined to be the unbias 588*25c28e83SPiotr Jasiukajtis * binary exponent of y in IEEE format. We can get x-0.5 to the desire 589*25c28e83SPiotr Jasiukajtis * accuracy easily. It remains to compute log(x)-1 with ilogb(L1(x))+5 590*25c28e83SPiotr Jasiukajtis * extra bits accracy. Note that the range of L1 is 88.30.., 709.3.., and 591*25c28e83SPiotr Jasiukajtis * 11356.10... for single, double, and quadruple precision, we have 592*25c28e83SPiotr Jasiukajtis * 593*25c28e83SPiotr Jasiukajtis * single double quadruple 594*25c28e83SPiotr Jasiukajtis * ------------------------------------ 595*25c28e83SPiotr Jasiukajtis * ilogb(L1(x))+5 <= 11 14 18 596*25c28e83SPiotr Jasiukajtis * ------------------------------------ 597*25c28e83SPiotr Jasiukajtis * 598*25c28e83SPiotr Jasiukajtis * (3) Table Driven Method for log(x)-1: 599*25c28e83SPiotr Jasiukajtis * -------------------------------------- 600*25c28e83SPiotr Jasiukajtis * Let x = 2**n * y, where 1 <= y < 2. Let Z={z(i),i=1,...,m} 601*25c28e83SPiotr Jasiukajtis * be a set of predetermined evenly distributed floating point numbers 602*25c28e83SPiotr Jasiukajtis * in [1, 2]. Let z(j) be the closest one to y, then 603*25c28e83SPiotr Jasiukajtis * log(x)-1 = n*log(2)-1 + log(y) 604*25c28e83SPiotr Jasiukajtis * = n*log(2)-1 + log(z(j)*y/z(j)) 605*25c28e83SPiotr Jasiukajtis * = n*log(2)-1 + log(z(j)) + log(y/z(j)) 606*25c28e83SPiotr Jasiukajtis * = T1(n) + T2(j) + T3, 607*25c28e83SPiotr Jasiukajtis * 608*25c28e83SPiotr Jasiukajtis * where T1(n) = n*log(2)-1 and T2(j) = log(z(j)). Both T1 and T2 can be 609*25c28e83SPiotr Jasiukajtis * pre-calculated and be looked-up in a table. Note that 8 <= x < 1756 610*25c28e83SPiotr Jasiukajtis * implies 3<=n<=10 implies 1.079.. < T1(n) < 6.931. 611*25c28e83SPiotr Jasiukajtis * 612*25c28e83SPiotr Jasiukajtis * 613*25c28e83SPiotr Jasiukajtis * y-z(i) y 1+s 614*25c28e83SPiotr Jasiukajtis * For T3, let s = --------; then ----- = ----- and 615*25c28e83SPiotr Jasiukajtis * y+z(i) z(i) 1-s 616*25c28e83SPiotr Jasiukajtis * 1+s 2 3 2 5 617*25c28e83SPiotr Jasiukajtis * T3 = log(-----) = 2s + --- s + --- s + .... 618*25c28e83SPiotr Jasiukajtis * 1-s 3 5 619*25c28e83SPiotr Jasiukajtis * 620*25c28e83SPiotr Jasiukajtis * Suppose the first term 2s is compute in extra precision. The 621*25c28e83SPiotr Jasiukajtis * dominating error in T3 would then be the rounding error of the 622*25c28e83SPiotr Jasiukajtis * second term 2/3*s**3. To force the rounding bounded by 623*25c28e83SPiotr Jasiukajtis * the required accuracy, we have 624*25c28e83SPiotr Jasiukajtis * single: |2/3*s**3| < 2**-11 == > |s|<0.09014... 625*25c28e83SPiotr Jasiukajtis * double: |2/3*s**3| < 2**-14 == > |s|<0.04507... 626*25c28e83SPiotr Jasiukajtis * quad : |2/3*s**3| < 2**-18 == > |s|<0.01788... = 2**(-5.80..) 627*25c28e83SPiotr Jasiukajtis * 628*25c28e83SPiotr Jasiukajtis * Base on this analysis, we choose Z = {z(i)|z(i)=1+i/64+1/128, 0<=i<=63}. 629*25c28e83SPiotr Jasiukajtis * For any y in [1,2), let j = [64*y] chopped to integer, then z(j) is 630*25c28e83SPiotr Jasiukajtis * the closest to y, and it is not difficult to see that |s| < 2**(-8). 631*25c28e83SPiotr Jasiukajtis * Please note that the polynomial approximation of T3 must be accurate 632*25c28e83SPiotr Jasiukajtis * -24-11 -35 -53-14 -67 -113-18 -131 633*25c28e83SPiotr Jasiukajtis * to 2 =2 , 2 = 2 , and 2 =2 634*25c28e83SPiotr Jasiukajtis * for single, double, and quadruple precision respectively. 635*25c28e83SPiotr Jasiukajtis * 636*25c28e83SPiotr Jasiukajtis * Inplementation notes. 637*25c28e83SPiotr Jasiukajtis * (1) Table look-up entries for T1(n) and T2(j), as well as the calculation 638*25c28e83SPiotr Jasiukajtis * of the leading term 2s in T3, are broken up into leading and trailing 639*25c28e83SPiotr Jasiukajtis * part such that (leading part)* 2**24 will always be an integer. That 640*25c28e83SPiotr Jasiukajtis * will guarantee the addition of the leading parts will be exact. 641*25c28e83SPiotr Jasiukajtis * 642*25c28e83SPiotr Jasiukajtis * 2**(-24) 643*25c28e83SPiotr Jasiukajtis * _________V___________________ 644*25c28e83SPiotr Jasiukajtis * T1(n): |_________|___________________| 645*25c28e83SPiotr Jasiukajtis * _______ ______________________ 646*25c28e83SPiotr Jasiukajtis * T2(j): |_______|______________________| 647*25c28e83SPiotr Jasiukajtis * ____ _______________________ 648*25c28e83SPiotr Jasiukajtis * 2s: |____|_______________________| 649*25c28e83SPiotr Jasiukajtis * __________________________ 650*25c28e83SPiotr Jasiukajtis * + T3(s)-2s: |__________________________| 651*25c28e83SPiotr Jasiukajtis * ------------------------------------------- 652*25c28e83SPiotr Jasiukajtis * [leading] + [Trailing] 653*25c28e83SPiotr Jasiukajtis * 654*25c28e83SPiotr Jasiukajtis * (2) How to compute 2s accurately. 655*25c28e83SPiotr Jasiukajtis * (A) Compute v = 2s to the working precision. If |v| < 2**(-18), 656*25c28e83SPiotr Jasiukajtis * stop. 657*25c28e83SPiotr Jasiukajtis * (B) chopped v to 2**(-24): v = ((int)(v*2**24))/2**24 658*25c28e83SPiotr Jasiukajtis * (C) 2s = v + (2s - v), where 659*25c28e83SPiotr Jasiukajtis * 1 660*25c28e83SPiotr Jasiukajtis * 2s - v = --- * (2(y-z) - v*(y+z) ) 661*25c28e83SPiotr Jasiukajtis * y+z 662*25c28e83SPiotr Jasiukajtis * 1 663*25c28e83SPiotr Jasiukajtis * = --- * ( [2(y-z) - v*(y+z)_h ] - v*(y+z)_l ) 664*25c28e83SPiotr Jasiukajtis * y+z 665*25c28e83SPiotr Jasiukajtis * where (y+z)_h = (y+z) rounded to 24 bits by (double)(float), 666*25c28e83SPiotr Jasiukajtis * and (y+z)_l = ((z+z)-(y+z)_h)+(y-z). Note the the quantity 667*25c28e83SPiotr Jasiukajtis * in [] is exact. 668*25c28e83SPiotr Jasiukajtis * 2 4 669*25c28e83SPiotr Jasiukajtis * (3) Remez approximation for (T3(s)-2s)/s = T3[0]*s + T3[1]*s + ...: 670*25c28e83SPiotr Jasiukajtis * Single precision: 1 term (compute in double precision arithmetic) 671*25c28e83SPiotr Jasiukajtis * T3(s) = 2s + S1*s^3, S1 = 0.6666717231848518054693623697539230 672*25c28e83SPiotr Jasiukajtis * Remez error: |T3(s)/s - (2s+S1*s^3)| < 2**(-35.87) 673*25c28e83SPiotr Jasiukajtis * Double precision: 3 terms, Remez error is bounded by 2**(-72.40), 674*25c28e83SPiotr Jasiukajtis * see "tgamma_log" 675*25c28e83SPiotr Jasiukajtis * Quad precision: 7 terms, Remez error is bounded by 2**(-136.54), 676*25c28e83SPiotr Jasiukajtis * see "tgammal_log" 677*25c28e83SPiotr Jasiukajtis * 678*25c28e83SPiotr Jasiukajtis * The computation of 0.5*(ln(2pi)-1): 679*25c28e83SPiotr Jasiukajtis * 0.5*(ln(2pi)-1) = 0.4189385332046727417803297364056176398614... 680*25c28e83SPiotr Jasiukajtis * split 0.5*(ln(2pi)-1) to hln2pi_h + hln2pi_l, where hln2pi_h is the 681*25c28e83SPiotr Jasiukajtis * leading 21 bits of the constant. 682*25c28e83SPiotr Jasiukajtis * hln2pi_h= 0.4189383983612060546875 683*25c28e83SPiotr Jasiukajtis * hln2pi_l= 1.348434666870928297364056176398612173648e-07 684*25c28e83SPiotr Jasiukajtis * 685*25c28e83SPiotr Jasiukajtis * The computation of 1/x*P(1/x^2) = log(G(x))-(x-.5)(ln(x)-1)-(.5ln(2pi)-1): 686*25c28e83SPiotr Jasiukajtis * Let s = 1/x <= 1/8 < 0.125. We have 687*25c28e83SPiotr Jasiukajtis * quad precision 688*25c28e83SPiotr Jasiukajtis * |GP(s) - s*P(s^2)| <= 2**(-120.6), where 689*25c28e83SPiotr Jasiukajtis * 3 5 39 690*25c28e83SPiotr Jasiukajtis * GP(s) = GP0*s+GP1*s +GP2*s +... +GP19*s , 691*25c28e83SPiotr Jasiukajtis * GP0 = 0.083333333333333333333333333333333172839171301 692*25c28e83SPiotr Jasiukajtis * hex 0x3ffe5555 55555555 55555555 55555548 693*25c28e83SPiotr Jasiukajtis * GP1 = -2.77777777777777777777777777492501211999399424104e-0003 694*25c28e83SPiotr Jasiukajtis * GP2 = 7.93650793650793650793635650541638236350020883243e-0004 695*25c28e83SPiotr Jasiukajtis * GP3 = -5.95238095238095238057299772679324503339241961704e-0004 696*25c28e83SPiotr Jasiukajtis * GP4 = 8.41750841750841696138422987977683524926142600321e-0004 697*25c28e83SPiotr Jasiukajtis * GP5 = -1.91752691752686682825032547823699662178842123308e-0003 698*25c28e83SPiotr Jasiukajtis * GP6 = 6.41025641022403480921891559356473451161279359322e-0003 699*25c28e83SPiotr Jasiukajtis * GP7 = -2.95506535798414019189819587455577003732808185071e-0002 700*25c28e83SPiotr Jasiukajtis * GP8 = 1.79644367229970031486079180060923073476568732136e-0001 701*25c28e83SPiotr Jasiukajtis * GP9 = -1.39243086487274662174562872567057200255649290646e+0000 702*25c28e83SPiotr Jasiukajtis * GP10 = 1.34025874044417962188677816477842265259608269775e+0001 703*25c28e83SPiotr Jasiukajtis * GP11 = -1.56803713480127469414495545399982508700748274318e+0002 704*25c28e83SPiotr Jasiukajtis * GP12 = 2.18739841656201561694927630335099313968924493891e+0003 705*25c28e83SPiotr Jasiukajtis * GP13 = -3.55249848644100338419187038090925410976237921269e+0004 706*25c28e83SPiotr Jasiukajtis * GP14 = 6.43464880437835286216768959439484376449179576452e+0005 707*25c28e83SPiotr Jasiukajtis * GP15 = -1.20459154385577014992600342782821389605893904624e+0007 708*25c28e83SPiotr Jasiukajtis * GP16 = 2.09263249637351298563934942349749718491071093210e+0008 709*25c28e83SPiotr Jasiukajtis * GP17 = -2.96247483183169219343745316433899599834685703457e+0009 710*25c28e83SPiotr Jasiukajtis * GP18 = 2.88984933605896033154727626086506756972327292981e+0010 711*25c28e83SPiotr Jasiukajtis * GP19 = -1.40960434146030007732838382416230610302678063984e+0011 712*25c28e83SPiotr Jasiukajtis * 713*25c28e83SPiotr Jasiukajtis * double precision 714*25c28e83SPiotr Jasiukajtis * |GP(s) - s*P(s^2)| <= 2**(-63.5), where 715*25c28e83SPiotr Jasiukajtis * 3 5 7 9 11 13 15 716*25c28e83SPiotr Jasiukajtis * GP(s) = GP0*s+GP1*s +GP2*s +GP3*s +GP4*s +GP5*s +GP6*s +GP7*s , 717*25c28e83SPiotr Jasiukajtis * 718*25c28e83SPiotr Jasiukajtis * GP0= 0.0833333333333333287074040640618477 (3FB55555 55555555) 719*25c28e83SPiotr Jasiukajtis * GP1= -2.77777777776649355200565611114627670089130772843e-0003 720*25c28e83SPiotr Jasiukajtis * GP2= 7.93650787486083724805476194170211775784158551509e-0004 721*25c28e83SPiotr Jasiukajtis * GP3= -5.95236628558314928757811419580281294593903582971e-0004 722*25c28e83SPiotr Jasiukajtis * GP4= 8.41566473999853451983137162780427812781178932540e-0004 723*25c28e83SPiotr Jasiukajtis * GP5= -1.90424776670441373564512942038926168175921303212e-0003 724*25c28e83SPiotr Jasiukajtis * GP6= 5.84933161530949666312333949534482303007354299178e-0003 725*25c28e83SPiotr Jasiukajtis * GP7= -1.59453228931082030262124832506144392496561694550e-0002 726*25c28e83SPiotr Jasiukajtis * single precision 727*25c28e83SPiotr Jasiukajtis * |GP(s) - s*P(s^2)| <= 2**(-37.78), where 728*25c28e83SPiotr Jasiukajtis * 3 5 729*25c28e83SPiotr Jasiukajtis * GP(s) = GP0*s+GP1*s +GP2*s 730*25c28e83SPiotr Jasiukajtis * GP0 = 8.33333330959694065245736888749042811909994573178e-0002 731*25c28e83SPiotr Jasiukajtis * GP1 = -2.77765545601667179767706600890361535225507762168e-0003 732*25c28e83SPiotr Jasiukajtis * GP2 = 7.77830853479775281781085278324621033523037489883e-0004 733*25c28e83SPiotr Jasiukajtis * 734*25c28e83SPiotr Jasiukajtis * 735*25c28e83SPiotr Jasiukajtis * Implementation note: 736*25c28e83SPiotr Jasiukajtis * z = (1/x), z2 = z*z, z4 = z2*z2; 737*25c28e83SPiotr Jasiukajtis * p = z*(GP0+z2*(GP1+....+z2*GP7)) 738*25c28e83SPiotr Jasiukajtis * = z*(GP0+(z4*(GP2+z4*(GP4+z4*GP6))+z2*(GP1+z4*(GP3+z4*(GP5+z4*GP7))))) 739*25c28e83SPiotr Jasiukajtis * 740*25c28e83SPiotr Jasiukajtis * Adding everything up: 741*25c28e83SPiotr Jasiukajtis * t = rr.h*ww.h+hln2pi_h ... exact 742*25c28e83SPiotr Jasiukajtis * w = (hln2pi_l + ((x-0.5)*ww.l+rr.l*ww.h)) + p 743*25c28e83SPiotr Jasiukajtis * 744*25c28e83SPiotr Jasiukajtis * Computing exp(t+w): 745*25c28e83SPiotr Jasiukajtis * s = t+w; write s = (n+j/32)*ln2+r, |r|<=(1/64)*ln2, then 746*25c28e83SPiotr Jasiukajtis * exp(s) = 2**n * (2**(j/32) + 2**(j/32)*expm1(r)), where 747*25c28e83SPiotr Jasiukajtis * expm1(r) = r + Et1*r^2 + Et2*r^3 + ... + Et5*r^6, and 748*25c28e83SPiotr Jasiukajtis * 2**(j/32) is obtained by table look-up S[j]+S_trail[j]. 749*25c28e83SPiotr Jasiukajtis * Remez error bound: 750*25c28e83SPiotr Jasiukajtis * |exp(r) - (1+r+Et1*r^2+...+Et5*r^6)| <= 2^(-63). 751*25c28e83SPiotr Jasiukajtis */ 752*25c28e83SPiotr Jasiukajtis 753*25c28e83SPiotr Jasiukajtis #include "libm.h" 754*25c28e83SPiotr Jasiukajtis 755*25c28e83SPiotr Jasiukajtis #define __HI(x) ((int *) &x)[HIWORD] 756*25c28e83SPiotr Jasiukajtis #define __LO(x) ((unsigned *) &x)[LOWORD] 757*25c28e83SPiotr Jasiukajtis 758*25c28e83SPiotr Jasiukajtis struct Double { 759*25c28e83SPiotr Jasiukajtis double h; 760*25c28e83SPiotr Jasiukajtis double l; 761*25c28e83SPiotr Jasiukajtis }; 762*25c28e83SPiotr Jasiukajtis 763*25c28e83SPiotr Jasiukajtis /* Hex value of GP0 shoule be 3FB55555 55555555 */ 764*25c28e83SPiotr Jasiukajtis static const double c[] = { 765*25c28e83SPiotr Jasiukajtis +1.0, 766*25c28e83SPiotr Jasiukajtis +2.0, 767*25c28e83SPiotr Jasiukajtis +0.5, 768*25c28e83SPiotr Jasiukajtis +1.0e-300, 769*25c28e83SPiotr Jasiukajtis +6.66666666666666740682e-01, /* A1=T3[0] */ 770*25c28e83SPiotr Jasiukajtis +3.99999999955626478023093908674902212920e-01, /* A2=T3[1] */ 771*25c28e83SPiotr Jasiukajtis +2.85720221533145659809237398709372330980e-01, /* A3=T3[2] */ 772*25c28e83SPiotr Jasiukajtis +0.0833333333333333287074040640618477, /* GP[0] */ 773*25c28e83SPiotr Jasiukajtis -2.77777777776649355200565611114627670089130772843e-03, 774*25c28e83SPiotr Jasiukajtis +7.93650787486083724805476194170211775784158551509e-04, 775*25c28e83SPiotr Jasiukajtis -5.95236628558314928757811419580281294593903582971e-04, 776*25c28e83SPiotr Jasiukajtis +8.41566473999853451983137162780427812781178932540e-04, 777*25c28e83SPiotr Jasiukajtis -1.90424776670441373564512942038926168175921303212e-03, 778*25c28e83SPiotr Jasiukajtis +5.84933161530949666312333949534482303007354299178e-03, 779*25c28e83SPiotr Jasiukajtis -1.59453228931082030262124832506144392496561694550e-02, 780*25c28e83SPiotr Jasiukajtis +4.18937683105468750000e-01, /* hln2pi_h */ 781*25c28e83SPiotr Jasiukajtis +8.50099203991780279640e-07, /* hln2pi_l */ 782*25c28e83SPiotr Jasiukajtis +4.18938533204672741744150788368695779923320328369e-01, /* hln2pi */ 783*25c28e83SPiotr Jasiukajtis +2.16608493865351192653e-02, /* ln2_32hi */ 784*25c28e83SPiotr Jasiukajtis +5.96317165397058656257e-12, /* ln2_32lo */ 785*25c28e83SPiotr Jasiukajtis +4.61662413084468283841e+01, /* invln2_32 */ 786*25c28e83SPiotr Jasiukajtis +5.0000000000000000000e-1, /* Et1 */ 787*25c28e83SPiotr Jasiukajtis +1.66666666665223585560605991943703896196054020060e-01, /* Et2 */ 788*25c28e83SPiotr Jasiukajtis +4.16666666665895103520154073534275286743788421687e-02, /* Et3 */ 789*25c28e83SPiotr Jasiukajtis +8.33336844093536520775865096538773197505523826029e-03, /* Et4 */ 790*25c28e83SPiotr Jasiukajtis +1.38889201930843436040204096950052984793587640227e-03, /* Et5 */ 791*25c28e83SPiotr Jasiukajtis }; 792*25c28e83SPiotr Jasiukajtis 793*25c28e83SPiotr Jasiukajtis #define one c[0] 794*25c28e83SPiotr Jasiukajtis #define two c[1] 795*25c28e83SPiotr Jasiukajtis #define half c[2] 796*25c28e83SPiotr Jasiukajtis #define tiny c[3] 797*25c28e83SPiotr Jasiukajtis #define A1 c[4] 798*25c28e83SPiotr Jasiukajtis #define A2 c[5] 799*25c28e83SPiotr Jasiukajtis #define A3 c[6] 800*25c28e83SPiotr Jasiukajtis #define GP0 c[7] 801*25c28e83SPiotr Jasiukajtis #define GP1 c[8] 802*25c28e83SPiotr Jasiukajtis #define GP2 c[9] 803*25c28e83SPiotr Jasiukajtis #define GP3 c[10] 804*25c28e83SPiotr Jasiukajtis #define GP4 c[11] 805*25c28e83SPiotr Jasiukajtis #define GP5 c[12] 806*25c28e83SPiotr Jasiukajtis #define GP6 c[13] 807*25c28e83SPiotr Jasiukajtis #define GP7 c[14] 808*25c28e83SPiotr Jasiukajtis #define hln2pi_h c[15] 809*25c28e83SPiotr Jasiukajtis #define hln2pi_l c[16] 810*25c28e83SPiotr Jasiukajtis #define hln2pi c[17] 811*25c28e83SPiotr Jasiukajtis #define ln2_32hi c[18] 812*25c28e83SPiotr Jasiukajtis #define ln2_32lo c[19] 813*25c28e83SPiotr Jasiukajtis #define invln2_32 c[20] 814*25c28e83SPiotr Jasiukajtis #define Et1 c[21] 815*25c28e83SPiotr Jasiukajtis #define Et2 c[22] 816*25c28e83SPiotr Jasiukajtis #define Et3 c[23] 817*25c28e83SPiotr Jasiukajtis #define Et4 c[24] 818*25c28e83SPiotr Jasiukajtis #define Et5 c[25] 819*25c28e83SPiotr Jasiukajtis 820*25c28e83SPiotr Jasiukajtis /* 821*25c28e83SPiotr Jasiukajtis * double precision coefficients for computing log(x)-1 in tgamma. 822*25c28e83SPiotr Jasiukajtis * See "algorithm" for details 823*25c28e83SPiotr Jasiukajtis * 824*25c28e83SPiotr Jasiukajtis * log(x) - 1 = T1(n) + T2(j) + T3(s), where x = 2**n * y, 1<=y<2, 825*25c28e83SPiotr Jasiukajtis * j=[64*y], z[j]=1+j/64+1/128, s = (y-z[j])/(y+z[j]), and 826*25c28e83SPiotr Jasiukajtis * T1(n) = T1[2n,2n+1] = n*log(2)-1, 827*25c28e83SPiotr Jasiukajtis * T2(j) = T2[2j,2j+1] = log(z[j]), 828*25c28e83SPiotr Jasiukajtis * T3(s) = 2s + T3[0]s^3 + T3[1]s^5 + T3[2]s^7 829*25c28e83SPiotr Jasiukajtis * = 2s + A1*s^3 + A2*s^5 + A3*s^7 (see const A1,A2,A3) 830*25c28e83SPiotr Jasiukajtis * Note 831*25c28e83SPiotr Jasiukajtis * (1) the leading entries are truncated to 24 binary point. 832*25c28e83SPiotr Jasiukajtis * See Remezpak/sun/tgamma_log_64.c 833*25c28e83SPiotr Jasiukajtis * (2) Remez error for T3(s) is bounded by 2**(-72.4) 834*25c28e83SPiotr Jasiukajtis * See mpremez/work/Log/tgamma_log_4_outr2 835*25c28e83SPiotr Jasiukajtis */ 836*25c28e83SPiotr Jasiukajtis 837*25c28e83SPiotr Jasiukajtis static const double T1[] = { 838*25c28e83SPiotr Jasiukajtis -1.00000000000000000000e+00, /* 0xBFF00000 0x00000000 */ 839*25c28e83SPiotr Jasiukajtis +0.00000000000000000000e+00, /* 0x00000000 0x00000000 */ 840*25c28e83SPiotr Jasiukajtis -3.06852817535400390625e-01, /* 0xBFD3A37A 0x00000000 */ 841*25c28e83SPiotr Jasiukajtis -1.90465429995776763166e-09, /* 0xBE205C61 0x0CA86C38 */ 842*25c28e83SPiotr Jasiukajtis +3.86294305324554443359e-01, /* 0x3FD8B90B 0xC0000000 */ 843*25c28e83SPiotr Jasiukajtis +5.57953361754750897367e-08, /* 0x3E6DF473 0xDE6AF279 */ 844*25c28e83SPiotr Jasiukajtis +1.07944148778915405273e+00, /* 0x3FF14564 0x70000000 */ 845*25c28e83SPiotr Jasiukajtis +5.38906818755173187963e-08, /* 0x3E6CEEAD 0xCDA06BB5 */ 846*25c28e83SPiotr Jasiukajtis +1.77258867025375366211e+00, /* 0x3FFC5C85 0xF0000000 */ 847*25c28e83SPiotr Jasiukajtis +5.19860275755595544734e-08, /* 0x3E6BE8E7 0xBCD5E4F2 */ 848*25c28e83SPiotr Jasiukajtis +2.46573585271835327148e+00, /* 0x4003B9D3 0xB8000000 */ 849*25c28e83SPiotr Jasiukajtis +5.00813732756017835330e-08, /* 0x3E6AE321 0xAC0B5E2E */ 850*25c28e83SPiotr Jasiukajtis +3.15888303518295288086e+00, /* 0x40094564 0x78000000 */ 851*25c28e83SPiotr Jasiukajtis +4.81767189756440192100e-08, /* 0x3E69DD5B 0x9B40D76B */ 852*25c28e83SPiotr Jasiukajtis +3.85203021764755249023e+00, /* 0x400ED0F5 0x38000000 */ 853*25c28e83SPiotr Jasiukajtis +4.62720646756862482697e-08, /* 0x3E68D795 0x8A7650A7 */ 854*25c28e83SPiotr Jasiukajtis +4.54517740011215209961e+00, /* 0x40122E42 0xFC000000 */ 855*25c28e83SPiotr Jasiukajtis +4.43674103757284839467e-08, /* 0x3E67D1CF 0x79ABC9E4 */ 856*25c28e83SPiotr Jasiukajtis +5.23832458257675170898e+00, /* 0x4014F40B 0x5C000000 */ 857*25c28e83SPiotr Jasiukajtis +4.24627560757707130063e-08, /* 0x3E66CC09 0x68E14320 */ 858*25c28e83SPiotr Jasiukajtis +5.93147176504135131836e+00, /* 0x4017B9D3 0xBC000000 */ 859*25c28e83SPiotr Jasiukajtis +4.05581017758129486834e-08, /* 0x3E65C643 0x5816BC5D */ 860*25c28e83SPiotr Jasiukajtis }; 861*25c28e83SPiotr Jasiukajtis 862*25c28e83SPiotr Jasiukajtis static const double T2[] = { 863*25c28e83SPiotr Jasiukajtis +7.78210163116455078125e-03, /* 0x3F7FE020 0x00000000 */ 864*25c28e83SPiotr Jasiukajtis +3.88108903981662140884e-08, /* 0x3E64D620 0xCF11F86F */ 865*25c28e83SPiotr Jasiukajtis +2.31670141220092773438e-02, /* 0x3F97B918 0x00000000 */ 866*25c28e83SPiotr Jasiukajtis +4.51595251008850513740e-08, /* 0x3E683EAD 0x88D54940 */ 867*25c28e83SPiotr Jasiukajtis +3.83188128471374511719e-02, /* 0x3FA39E86 0x00000000 */ 868*25c28e83SPiotr Jasiukajtis +5.14549991480218823411e-08, /* 0x3E6B9FEB 0xD5FA9016 */ 869*25c28e83SPiotr Jasiukajtis +5.32444715499877929688e-02, /* 0x3FAB42DC 0x00000000 */ 870*25c28e83SPiotr Jasiukajtis +4.29688244898971182165e-08, /* 0x3E671197 0x1BEC28D1 */ 871*25c28e83SPiotr Jasiukajtis +6.79506063461303710938e-02, /* 0x3FB16536 0x00000000 */ 872*25c28e83SPiotr Jasiukajtis +5.55623773783008185114e-08, /* 0x3E6DD46F 0x5C1D0C4C */ 873*25c28e83SPiotr Jasiukajtis +8.24436545372009277344e-02, /* 0x3FB51B07 0x00000000 */ 874*25c28e83SPiotr Jasiukajtis +1.46738736635337847313e-08, /* 0x3E4F830C 0x1FB493C7 */ 875*25c28e83SPiotr Jasiukajtis +9.67295765876770019531e-02, /* 0x3FB8C345 0x00000000 */ 876*25c28e83SPiotr Jasiukajtis +4.98708741103424492282e-08, /* 0x3E6AC633 0x641EB597 */ 877*25c28e83SPiotr Jasiukajtis +1.10814332962036132812e-01, /* 0x3FBC5E54 0x00000000 */ 878*25c28e83SPiotr Jasiukajtis +3.33782539813823062226e-08, /* 0x3E61EB78 0xE862BAC3 */ 879*25c28e83SPiotr Jasiukajtis +1.24703466892242431641e-01, /* 0x3FBFEC91 0x00000000 */ 880*25c28e83SPiotr Jasiukajtis +1.16087148042227818450e-08, /* 0x3E48EDF5 0x5D551729 */ 881*25c28e83SPiotr Jasiukajtis +1.38402283191680908203e-01, /* 0x3FC1B72A 0x80000000 */ 882*25c28e83SPiotr Jasiukajtis +3.96674382274822001957e-08, /* 0x3E654BD9 0xE80A4181 */ 883*25c28e83SPiotr Jasiukajtis +1.51916027069091796875e-01, /* 0x3FC371FC 0x00000000 */ 884*25c28e83SPiotr Jasiukajtis +1.49567501781968021494e-08, /* 0x3E500F47 0xBA1DE6CB */ 885*25c28e83SPiotr Jasiukajtis +1.65249526500701904297e-01, /* 0x3FC526E5 0x80000000 */ 886*25c28e83SPiotr Jasiukajtis +4.63946052585787334062e-08, /* 0x3E68E86D 0x0DE8B900 */ 887*25c28e83SPiotr Jasiukajtis +1.78407609462738037109e-01, /* 0x3FC6D60F 0x80000000 */ 888*25c28e83SPiotr Jasiukajtis +4.80100802600100279538e-08, /* 0x3E69C674 0x8723551E */ 889*25c28e83SPiotr Jasiukajtis +1.91394805908203125000e-01, /* 0x3FC87FA0 0x00000000 */ 890*25c28e83SPiotr Jasiukajtis +4.70914263296092971436e-08, /* 0x3E694832 0x44240802 */ 891*25c28e83SPiotr Jasiukajtis +2.04215526580810546875e-01, /* 0x3FCA23BC 0x00000000 */ 892*25c28e83SPiotr Jasiukajtis +1.48478803446288209001e-08, /* 0x3E4FE2B5 0x63193712 */ 893*25c28e83SPiotr Jasiukajtis +2.16873884201049804688e-01, /* 0x3FCBC286 0x00000000 */ 894*25c28e83SPiotr Jasiukajtis +5.40995645549315919488e-08, /* 0x3E6D0B63 0x358A7E74 */ 895*25c28e83SPiotr Jasiukajtis +2.29374051094055175781e-01, /* 0x3FCD5C21 0x00000000 */ 896*25c28e83SPiotr Jasiukajtis +4.99707906542102284117e-08, /* 0x3E6AD3EE 0xE456E443 */ 897*25c28e83SPiotr Jasiukajtis +2.41719901561737060547e-01, /* 0x3FCEF0AD 0x80000000 */ 898*25c28e83SPiotr Jasiukajtis +3.53254081075974352804e-08, /* 0x3E62F716 0x4D948638 */ 899*25c28e83SPiotr Jasiukajtis +2.53915190696716308594e-01, /* 0x3FD04025 0x80000000 */ 900*25c28e83SPiotr Jasiukajtis +1.92842471355435739091e-08, /* 0x3E54B4D0 0x40DAE27C */ 901*25c28e83SPiotr Jasiukajtis +2.65963494777679443359e-01, /* 0x3FD1058B 0xC0000000 */ 902*25c28e83SPiotr Jasiukajtis +5.37194584979797487125e-08, /* 0x3E6CD725 0x6A8C4FD0 */ 903*25c28e83SPiotr Jasiukajtis +2.77868449687957763672e-01, /* 0x3FD1C898 0xC0000000 */ 904*25c28e83SPiotr Jasiukajtis +1.31549854251447496506e-09, /* 0x3E16999F 0xAFBC68E7 */ 905*25c28e83SPiotr Jasiukajtis +2.89633274078369140625e-01, /* 0x3FD2895A 0x00000000 */ 906*25c28e83SPiotr Jasiukajtis +1.85046735362538929911e-08, /* 0x3E53DE86 0xA35EB493 */ 907*25c28e83SPiotr Jasiukajtis +3.01261305809020996094e-01, /* 0x3FD347DD 0x80000000 */ 908*25c28e83SPiotr Jasiukajtis +2.47691407849191245052e-08, /* 0x3E5A987D 0x54D64567 */ 909*25c28e83SPiotr Jasiukajtis +3.12755703926086425781e-01, /* 0x3FD40430 0x80000000 */ 910*25c28e83SPiotr Jasiukajtis +6.07781046260499658610e-09, /* 0x3E3A1A9F 0x8EF4304A */ 911*25c28e83SPiotr Jasiukajtis +3.24119448661804199219e-01, /* 0x3FD4BE5F 0x80000000 */ 912*25c28e83SPiotr Jasiukajtis +1.99924077768719198045e-08, /* 0x3E557778 0xA0DB4C99 */ 913*25c28e83SPiotr Jasiukajtis +3.35355520248413085938e-01, /* 0x3FD57677 0x00000000 */ 914*25c28e83SPiotr Jasiukajtis +2.16727247443196802771e-08, /* 0x3E57455A 0x6C549AB7 */ 915*25c28e83SPiotr Jasiukajtis +3.46466720104217529297e-01, /* 0x3FD62C82 0xC0000000 */ 916*25c28e83SPiotr Jasiukajtis +4.72419910516215900493e-08, /* 0x3E695CE3 0xCA97B7B0 */ 917*25c28e83SPiotr Jasiukajtis +3.57455849647521972656e-01, /* 0x3FD6E08E 0x80000000 */ 918*25c28e83SPiotr Jasiukajtis +3.92742818015697624778e-08, /* 0x3E6515D0 0xF1C609CA */ 919*25c28e83SPiotr Jasiukajtis +3.68325531482696533203e-01, /* 0x3FD792A5 0x40000000 */ 920*25c28e83SPiotr Jasiukajtis +2.96760111198451042238e-08, /* 0x3E5FDD47 0xA27C15DA */ 921*25c28e83SPiotr Jasiukajtis +3.79078328609466552734e-01, /* 0x3FD842D1 0xC0000000 */ 922*25c28e83SPiotr Jasiukajtis +2.43255029056564770289e-08, /* 0x3E5A1E8B 0x17493B14 */ 923*25c28e83SPiotr Jasiukajtis +3.89716744422912597656e-01, /* 0x3FD8F11E 0x80000000 */ 924*25c28e83SPiotr Jasiukajtis +6.71711261571421332726e-09, /* 0x3E3CD98B 0x1DF85DA7 */ 925*25c28e83SPiotr Jasiukajtis +4.00243163108825683594e-01, /* 0x3FD99D95 0x80000000 */ 926*25c28e83SPiotr Jasiukajtis +1.01818702333557515008e-09, /* 0x3E117E08 0xACBA92EF */ 927*25c28e83SPiotr Jasiukajtis +4.10659909248352050781e-01, /* 0x3FDA4840 0x80000000 */ 928*25c28e83SPiotr Jasiukajtis +1.57369163351530571459e-08, /* 0x3E50E5BB 0x0A2BFCA7 */ 929*25c28e83SPiotr Jasiukajtis +4.20969247817993164062e-01, /* 0x3FDAF129 0x00000000 */ 930*25c28e83SPiotr Jasiukajtis +4.68261364720663662040e-08, /* 0x3E6923BC 0x358899C2 */ 931*25c28e83SPiotr Jasiukajtis +4.31173443794250488281e-01, /* 0x3FDB9858 0x80000000 */ 932*25c28e83SPiotr Jasiukajtis +2.10241208525779214510e-08, /* 0x3E569310 0xFB598FB1 */ 933*25c28e83SPiotr Jasiukajtis +4.41274523735046386719e-01, /* 0x3FDC3DD7 0x80000000 */ 934*25c28e83SPiotr Jasiukajtis +3.70698288427707487748e-08, /* 0x3E63E6D6 0xA6B9D9E1 */ 935*25c28e83SPiotr Jasiukajtis +4.51274633407592773438e-01, /* 0x3FDCE1AF 0x00000000 */ 936*25c28e83SPiotr Jasiukajtis +1.07318658117071930723e-08, /* 0x3E470BE7 0xD6F6FA58 */ 937*25c28e83SPiotr Jasiukajtis +4.61175680160522460938e-01, /* 0x3FDD83E7 0x00000000 */ 938*25c28e83SPiotr Jasiukajtis +3.49616477054305011286e-08, /* 0x3E62C517 0x9F2828AE */ 939*25c28e83SPiotr Jasiukajtis +4.70979690551757812500e-01, /* 0x3FDE2488 0x00000000 */ 940*25c28e83SPiotr Jasiukajtis +2.46670332000468969567e-08, /* 0x3E5A7C6C 0x261CBD8F */ 941*25c28e83SPiotr Jasiukajtis +4.80688512325286865234e-01, /* 0x3FDEC399 0xC0000000 */ 942*25c28e83SPiotr Jasiukajtis +1.70204650424422423704e-08, /* 0x3E52468C 0xC0175CEE */ 943*25c28e83SPiotr Jasiukajtis +4.90303933620452880859e-01, /* 0x3FDF6123 0xC0000000 */ 944*25c28e83SPiotr Jasiukajtis +5.44247409572909703749e-08, /* 0x3E6D3814 0x5630A2B6 */ 945*25c28e83SPiotr Jasiukajtis +4.99827861785888671875e-01, /* 0x3FDFFD2E 0x00000000 */ 946*25c28e83SPiotr Jasiukajtis +7.77056065794633071345e-09, /* 0x3E40AFE9 0x30AB2FA0 */ 947*25c28e83SPiotr Jasiukajtis +5.09261846542358398438e-01, /* 0x3FE04BDF 0x80000000 */ 948*25c28e83SPiotr Jasiukajtis +5.52474495483665749052e-08, /* 0x3E6DA926 0xD265FCC1 */ 949*25c28e83SPiotr Jasiukajtis +5.18607735633850097656e-01, /* 0x3FE0986F 0x40000000 */ 950*25c28e83SPiotr Jasiukajtis +2.85741955344967264536e-08, /* 0x3E5EAE6A 0x41723FB5 */ 951*25c28e83SPiotr Jasiukajtis +5.27867078781127929688e-01, /* 0x3FE0E449 0x80000000 */ 952*25c28e83SPiotr Jasiukajtis +1.08397144554263914271e-08, /* 0x3E474732 0x2FDBAB97 */ 953*25c28e83SPiotr Jasiukajtis +5.37041425704956054688e-01, /* 0x3FE12F71 0x80000000 */ 954*25c28e83SPiotr Jasiukajtis +4.01919275998792285777e-08, /* 0x3E6593EF 0xBC530123 */ 955*25c28e83SPiotr Jasiukajtis +5.46132385730743408203e-01, /* 0x3FE179EA 0xA0000000 */ 956*25c28e83SPiotr Jasiukajtis +5.18673922421792693237e-08, /* 0x3E6BD899 0xA0BFC60E */ 957*25c28e83SPiotr Jasiukajtis +5.55141448974609375000e-01, /* 0x3FE1C3B8 0x00000000 */ 958*25c28e83SPiotr Jasiukajtis +5.85658922177154808539e-08, /* 0x3E6F713C 0x24BC94F9 */ 959*25c28e83SPiotr Jasiukajtis +5.64070105552673339844e-01, /* 0x3FE20CDC 0xC0000000 */ 960*25c28e83SPiotr Jasiukajtis +3.27321296262276338905e-08, /* 0x3E6192AB 0x6D93503D */ 961*25c28e83SPiotr Jasiukajtis +5.72919726371765136719e-01, /* 0x3FE2555B 0xC0000000 */ 962*25c28e83SPiotr Jasiukajtis +2.71900203723740076878e-08, /* 0x3E5D31EF 0x96780876 */ 963*25c28e83SPiotr Jasiukajtis +5.81691682338714599609e-01, /* 0x3FE29D37 0xE0000000 */ 964*25c28e83SPiotr Jasiukajtis +5.72959078829112371070e-08, /* 0x3E6EC2B0 0x8AC85CD7 */ 965*25c28e83SPiotr Jasiukajtis +5.90387403964996337891e-01, /* 0x3FE2E474 0x20000000 */ 966*25c28e83SPiotr Jasiukajtis +4.26371800367512948470e-08, /* 0x3E66E402 0x68405422 */ 967*25c28e83SPiotr Jasiukajtis +5.99008142948150634766e-01, /* 0x3FE32B13 0x20000000 */ 968*25c28e83SPiotr Jasiukajtis +4.66979327646159769249e-08, /* 0x3E69121D 0x71320557 */ 969*25c28e83SPiotr Jasiukajtis +6.07555210590362548828e-01, /* 0x3FE37117 0xA0000000 */ 970*25c28e83SPiotr Jasiukajtis +3.96341792466729582847e-08, /* 0x3E654747 0xB5C5DD02 */ 971*25c28e83SPiotr Jasiukajtis +6.16029858589172363281e-01, /* 0x3FE3B684 0x40000000 */ 972*25c28e83SPiotr Jasiukajtis +1.86263416563663175432e-08, /* 0x3E53FFF8 0x455F1DBE */ 973*25c28e83SPiotr Jasiukajtis +6.24433279037475585938e-01, /* 0x3FE3FB5B 0x80000000 */ 974*25c28e83SPiotr Jasiukajtis +8.97441791510503832111e-09, /* 0x3E4345BD 0x096D3A75 */ 975*25c28e83SPiotr Jasiukajtis +6.32766664028167724609e-01, /* 0x3FE43F9F 0xE0000000 */ 976*25c28e83SPiotr Jasiukajtis +5.54287010493641158796e-09, /* 0x3E37CE73 0x3BD393DD */ 977*25c28e83SPiotr Jasiukajtis +6.41031146049499511719e-01, /* 0x3FE48353 0xC0000000 */ 978*25c28e83SPiotr Jasiukajtis +3.33714317793368531132e-08, /* 0x3E61EA88 0xDF73D5E9 */ 979*25c28e83SPiotr Jasiukajtis +6.49227917194366455078e-01, /* 0x3FE4C679 0xA0000000 */ 980*25c28e83SPiotr Jasiukajtis +2.94307433638127158696e-08, /* 0x3E5F99DC 0x7362D1DA */ 981*25c28e83SPiotr Jasiukajtis +6.57358050346374511719e-01, /* 0x3FE50913 0xC0000000 */ 982*25c28e83SPiotr Jasiukajtis +2.23619855184231409785e-08, /* 0x3E5802D0 0xD6979675 */ 983*25c28e83SPiotr Jasiukajtis +6.65422618389129638672e-01, /* 0x3FE54B24 0x60000000 */ 984*25c28e83SPiotr Jasiukajtis +1.41559608102782173188e-08, /* 0x3E4E6652 0x5EA4550A */ 985*25c28e83SPiotr Jasiukajtis +6.73422634601593017578e-01, /* 0x3FE58CAD 0xA0000000 */ 986*25c28e83SPiotr Jasiukajtis +4.06105737027198329700e-08, /* 0x3E65CD79 0x893092F2 */ 987*25c28e83SPiotr Jasiukajtis +6.81359171867370605469e-01, /* 0x3FE5CDB1 0xC0000000 */ 988*25c28e83SPiotr Jasiukajtis +5.29405324634793230630e-08, /* 0x3E6C6C17 0x648CF6E4 */ 989*25c28e83SPiotr Jasiukajtis +6.89233243465423583984e-01, /* 0x3FE60E32 0xE0000000 */ 990*25c28e83SPiotr Jasiukajtis +3.77733853963405370102e-08, /* 0x3E644788 0xD8CA7C89 */ 991*25c28e83SPiotr Jasiukajtis }; 992*25c28e83SPiotr Jasiukajtis 993*25c28e83SPiotr Jasiukajtis /* S[j],S_trail[j] = 2**(j/32.) for the final computation of exp(t+w) */ 994*25c28e83SPiotr Jasiukajtis static const double S[] = { 995*25c28e83SPiotr Jasiukajtis +1.00000000000000000000e+00, /* 3FF0000000000000 */ 996*25c28e83SPiotr Jasiukajtis +1.02189714865411662714e+00, /* 3FF059B0D3158574 */ 997*25c28e83SPiotr Jasiukajtis +1.04427378242741375480e+00, /* 3FF0B5586CF9890F */ 998*25c28e83SPiotr Jasiukajtis +1.06714040067682369717e+00, /* 3FF11301D0125B51 */ 999*25c28e83SPiotr Jasiukajtis +1.09050773266525768967e+00, /* 3FF172B83C7D517B */ 1000*25c28e83SPiotr Jasiukajtis +1.11438674259589243221e+00, /* 3FF1D4873168B9AA */ 1001*25c28e83SPiotr Jasiukajtis +1.13878863475669156458e+00, /* 3FF2387A6E756238 */ 1002*25c28e83SPiotr Jasiukajtis +1.16372485877757747552e+00, /* 3FF29E9DF51FDEE1 */ 1003*25c28e83SPiotr Jasiukajtis +1.18920711500272102690e+00, /* 3FF306FE0A31B715 */ 1004*25c28e83SPiotr Jasiukajtis +1.21524735998046895524e+00, /* 3FF371A7373AA9CB */ 1005*25c28e83SPiotr Jasiukajtis +1.24185781207348400201e+00, /* 3FF3DEA64C123422 */ 1006*25c28e83SPiotr Jasiukajtis +1.26905095719173321989e+00, /* 3FF44E086061892D */ 1007*25c28e83SPiotr Jasiukajtis +1.29683955465100964055e+00, /* 3FF4BFDAD5362A27 */ 1008*25c28e83SPiotr Jasiukajtis +1.32523664315974132322e+00, /* 3FF5342B569D4F82 */ 1009*25c28e83SPiotr Jasiukajtis +1.35425554693689265129e+00, /* 3FF5AB07DD485429 */ 1010*25c28e83SPiotr Jasiukajtis +1.38390988196383202258e+00, /* 3FF6247EB03A5585 */ 1011*25c28e83SPiotr Jasiukajtis +1.41421356237309514547e+00, /* 3FF6A09E667F3BCD */ 1012*25c28e83SPiotr Jasiukajtis +1.44518080697704665027e+00, /* 3FF71F75E8EC5F74 */ 1013*25c28e83SPiotr Jasiukajtis +1.47682614593949934623e+00, /* 3FF7A11473EB0187 */ 1014*25c28e83SPiotr Jasiukajtis +1.50916442759342284141e+00, /* 3FF82589994CCE13 */ 1015*25c28e83SPiotr Jasiukajtis +1.54221082540794074411e+00, /* 3FF8ACE5422AA0DB */ 1016*25c28e83SPiotr Jasiukajtis +1.57598084510788649659e+00, /* 3FF93737B0CDC5E5 */ 1017*25c28e83SPiotr Jasiukajtis +1.61049033194925428347e+00, /* 3FF9C49182A3F090 */ 1018*25c28e83SPiotr Jasiukajtis +1.64575547815396494578e+00, /* 3FFA5503B23E255D */ 1019*25c28e83SPiotr Jasiukajtis +1.68179283050742900407e+00, /* 3FFAE89F995AD3AD */ 1020*25c28e83SPiotr Jasiukajtis +1.71861929812247793414e+00, /* 3FFB7F76F2FB5E47 */ 1021*25c28e83SPiotr Jasiukajtis +1.75625216037329945351e+00, /* 3FFC199BDD85529C */ 1022*25c28e83SPiotr Jasiukajtis +1.79470907500310716820e+00, /* 3FFCB720DCEF9069 */ 1023*25c28e83SPiotr Jasiukajtis +1.83400808640934243066e+00, /* 3FFD5818DCFBA487 */ 1024*25c28e83SPiotr Jasiukajtis +1.87416763411029996256e+00, /* 3FFDFC97337B9B5F */ 1025*25c28e83SPiotr Jasiukajtis +1.91520656139714740007e+00, /* 3FFEA4AFA2A490DA */ 1026*25c28e83SPiotr Jasiukajtis +1.95714412417540017941e+00, /* 3FFF50765B6E4540 */ 1027*25c28e83SPiotr Jasiukajtis }; 1028*25c28e83SPiotr Jasiukajtis 1029*25c28e83SPiotr Jasiukajtis static const double S_trail[] = { 1030*25c28e83SPiotr Jasiukajtis +0.00000000000000000000e+00, 1031*25c28e83SPiotr Jasiukajtis +5.10922502897344389359e-17, /* 3C8D73E2A475B465 */ 1032*25c28e83SPiotr Jasiukajtis +8.55188970553796365958e-17, /* 3C98A62E4ADC610A */ 1033*25c28e83SPiotr Jasiukajtis -7.89985396684158212226e-17, /* BC96C51039449B3A */ 1034*25c28e83SPiotr Jasiukajtis -3.04678207981247114697e-17, /* BC819041B9D78A76 */ 1035*25c28e83SPiotr Jasiukajtis +1.04102784568455709549e-16, /* 3C9E016E00A2643C */ 1036*25c28e83SPiotr Jasiukajtis +8.91281267602540777782e-17, /* 3C99B07EB6C70573 */ 1037*25c28e83SPiotr Jasiukajtis +3.82920483692409349872e-17, /* 3C8612E8AFAD1255 */ 1038*25c28e83SPiotr Jasiukajtis +3.98201523146564611098e-17, /* 3C86F46AD23182E4 */ 1039*25c28e83SPiotr Jasiukajtis -7.71263069268148813091e-17, /* BC963AEABF42EAE2 */ 1040*25c28e83SPiotr Jasiukajtis +4.65802759183693679123e-17, /* 3C8ADA0911F09EBC */ 1041*25c28e83SPiotr Jasiukajtis +2.66793213134218609523e-18, /* 3C489B7A04EF80D0 */ 1042*25c28e83SPiotr Jasiukajtis +2.53825027948883149593e-17, /* 3C7D4397AFEC42E2 */ 1043*25c28e83SPiotr Jasiukajtis -2.85873121003886075697e-17, /* BC807ABE1DB13CAC */ 1044*25c28e83SPiotr Jasiukajtis +7.70094837980298946162e-17, /* 3C96324C054647AD */ 1045*25c28e83SPiotr Jasiukajtis -6.77051165879478628716e-17, /* BC9383C17E40B497 */ 1046*25c28e83SPiotr Jasiukajtis -9.66729331345291345105e-17, /* BC9BDD3413B26456 */ 1047*25c28e83SPiotr Jasiukajtis -3.02375813499398731940e-17, /* BC816E4786887A99 */ 1048*25c28e83SPiotr Jasiukajtis -3.48399455689279579579e-17, /* BC841577EE04992F */ 1049*25c28e83SPiotr Jasiukajtis -1.01645532775429503911e-16, /* BC9D4C1DD41532D8 */ 1050*25c28e83SPiotr Jasiukajtis +7.94983480969762085616e-17, /* 3C96E9F156864B27 */ 1051*25c28e83SPiotr Jasiukajtis -1.01369164712783039808e-17, /* BC675FC781B57EBC */ 1052*25c28e83SPiotr Jasiukajtis +2.47071925697978878522e-17, /* 3C7C7C46B071F2BE */ 1053*25c28e83SPiotr Jasiukajtis -1.01256799136747726038e-16, /* BC9D2F6EDB8D41E1 */ 1054*25c28e83SPiotr Jasiukajtis +8.19901002058149652013e-17, /* 3C97A1CD345DCC81 */ 1055*25c28e83SPiotr Jasiukajtis -1.85138041826311098821e-17, /* BC75584F7E54AC3B */ 1056*25c28e83SPiotr Jasiukajtis +2.96014069544887330703e-17, /* 3C811065895048DD */ 1057*25c28e83SPiotr Jasiukajtis +1.82274584279120867698e-17, /* 3C7503CBD1E949DB */ 1058*25c28e83SPiotr Jasiukajtis +3.28310722424562658722e-17, /* 3C82ED02D75B3706 */ 1059*25c28e83SPiotr Jasiukajtis -6.12276341300414256164e-17, /* BC91A5CD4F184B5C */ 1060*25c28e83SPiotr Jasiukajtis -1.06199460561959626376e-16, /* BC9E9C23179C2893 */ 1061*25c28e83SPiotr Jasiukajtis +8.96076779103666776760e-17, /* 3C99D3E12DD8A18B */ 1062*25c28e83SPiotr Jasiukajtis }; 1063*25c28e83SPiotr Jasiukajtis 1064*25c28e83SPiotr Jasiukajtis /* Primary interval GTi() */ 1065*25c28e83SPiotr Jasiukajtis static const double cr[] = { 1066*25c28e83SPiotr Jasiukajtis /* p1, q1 */ 1067*25c28e83SPiotr Jasiukajtis +0.70908683619977797008004927192814648151397705078125000, 1068*25c28e83SPiotr Jasiukajtis +1.71987061393048558089579513384356441668351720061e-0001, 1069*25c28e83SPiotr Jasiukajtis -3.19273345791990970293320316122813960527705450671e-0002, 1070*25c28e83SPiotr Jasiukajtis +8.36172645419110036267169600390549973563534476989e-0003, 1071*25c28e83SPiotr Jasiukajtis +1.13745336648572838333152213474277971244629758101e-0003, 1072*25c28e83SPiotr Jasiukajtis +1.0, 1073*25c28e83SPiotr Jasiukajtis +9.71980217826032937526460731778472389791321968082e-0001, 1074*25c28e83SPiotr Jasiukajtis -7.43576743326756176594084137256042653497087666030e-0002, 1075*25c28e83SPiotr Jasiukajtis -1.19345944932265559769719470515102012246995255372e-0001, 1076*25c28e83SPiotr Jasiukajtis +1.59913445751425002620935120470781382215050284762e-0002, 1077*25c28e83SPiotr Jasiukajtis +1.12601136853374984566572691306402321911547550783e-0003, 1078*25c28e83SPiotr Jasiukajtis /* p2, q2 */ 1079*25c28e83SPiotr Jasiukajtis +0.42848681585558601181418225678498856723308563232421875, 1080*25c28e83SPiotr Jasiukajtis +6.53596762668970816023718845105667418483122103629e-0002, 1081*25c28e83SPiotr Jasiukajtis -6.97280829631212931321050770925128264272768936731e-0003, 1082*25c28e83SPiotr Jasiukajtis +6.46342359021981718947208605674813260166116632899e-0003, 1083*25c28e83SPiotr Jasiukajtis +1.0, 1084*25c28e83SPiotr Jasiukajtis +4.57572620560506047062553957454062012327519313936e-0001, 1085*25c28e83SPiotr Jasiukajtis -2.52182594886075452859655003407796103083422572036e-0001, 1086*25c28e83SPiotr Jasiukajtis -1.82970945407778594681348166040103197178711552827e-0002, 1087*25c28e83SPiotr Jasiukajtis +2.43574726993169566475227642128830141304953840502e-0002, 1088*25c28e83SPiotr Jasiukajtis -5.20390406466942525358645957564897411258667085501e-0003, 1089*25c28e83SPiotr Jasiukajtis +4.79520251383279837635552431988023256031951133885e-0004, 1090*25c28e83SPiotr Jasiukajtis /* p3, q3 */ 1091*25c28e83SPiotr Jasiukajtis +0.382409479734567459008331979930517263710498809814453125, 1092*25c28e83SPiotr Jasiukajtis +1.42876048697668161599069814043449301572928034140e-0001, 1093*25c28e83SPiotr Jasiukajtis +3.42157571052250536817923866013561760785748899071e-0003, 1094*25c28e83SPiotr Jasiukajtis -5.01542621710067521405087887856991700987709272937e-0004, 1095*25c28e83SPiotr Jasiukajtis +8.89285814866740910123834688163838287618332122670e-0004, 1096*25c28e83SPiotr Jasiukajtis +1.0, 1097*25c28e83SPiotr Jasiukajtis +3.04253086629444201002215640948957897906299633168e-0001, 1098*25c28e83SPiotr Jasiukajtis -2.23162407379999477282555672834881213873185520006e-0001, 1099*25c28e83SPiotr Jasiukajtis -1.05060867741952065921809811933670131427552903636e-0002, 1100*25c28e83SPiotr Jasiukajtis +1.70511763916186982473301861980856352005926669320e-0002, 1101*25c28e83SPiotr Jasiukajtis -2.12950201683609187927899416700094630764182477464e-0003, 1102*25c28e83SPiotr Jasiukajtis }; 1103*25c28e83SPiotr Jasiukajtis 1104*25c28e83SPiotr Jasiukajtis #define P10 cr[0] 1105*25c28e83SPiotr Jasiukajtis #define P11 cr[1] 1106*25c28e83SPiotr Jasiukajtis #define P12 cr[2] 1107*25c28e83SPiotr Jasiukajtis #define P13 cr[3] 1108*25c28e83SPiotr Jasiukajtis #define P14 cr[4] 1109*25c28e83SPiotr Jasiukajtis #define Q10 cr[5] 1110*25c28e83SPiotr Jasiukajtis #define Q11 cr[6] 1111*25c28e83SPiotr Jasiukajtis #define Q12 cr[7] 1112*25c28e83SPiotr Jasiukajtis #define Q13 cr[8] 1113*25c28e83SPiotr Jasiukajtis #define Q14 cr[9] 1114*25c28e83SPiotr Jasiukajtis #define Q15 cr[10] 1115*25c28e83SPiotr Jasiukajtis #define P20 cr[11] 1116*25c28e83SPiotr Jasiukajtis #define P21 cr[12] 1117*25c28e83SPiotr Jasiukajtis #define P22 cr[13] 1118*25c28e83SPiotr Jasiukajtis #define P23 cr[14] 1119*25c28e83SPiotr Jasiukajtis #define Q20 cr[15] 1120*25c28e83SPiotr Jasiukajtis #define Q21 cr[16] 1121*25c28e83SPiotr Jasiukajtis #define Q22 cr[17] 1122*25c28e83SPiotr Jasiukajtis #define Q23 cr[18] 1123*25c28e83SPiotr Jasiukajtis #define Q24 cr[19] 1124*25c28e83SPiotr Jasiukajtis #define Q25 cr[20] 1125*25c28e83SPiotr Jasiukajtis #define Q26 cr[21] 1126*25c28e83SPiotr Jasiukajtis #define P30 cr[22] 1127*25c28e83SPiotr Jasiukajtis #define P31 cr[23] 1128*25c28e83SPiotr Jasiukajtis #define P32 cr[24] 1129*25c28e83SPiotr Jasiukajtis #define P33 cr[25] 1130*25c28e83SPiotr Jasiukajtis #define P34 cr[26] 1131*25c28e83SPiotr Jasiukajtis #define Q30 cr[27] 1132*25c28e83SPiotr Jasiukajtis #define Q31 cr[28] 1133*25c28e83SPiotr Jasiukajtis #define Q32 cr[29] 1134*25c28e83SPiotr Jasiukajtis #define Q33 cr[30] 1135*25c28e83SPiotr Jasiukajtis #define Q34 cr[31] 1136*25c28e83SPiotr Jasiukajtis #define Q35 cr[32] 1137*25c28e83SPiotr Jasiukajtis 1138*25c28e83SPiotr Jasiukajtis static const double 1139*25c28e83SPiotr Jasiukajtis GZ1_h = +0.938204627909682398190, 1140*25c28e83SPiotr Jasiukajtis GZ1_l = +5.121952600248205157935e-17, 1141*25c28e83SPiotr Jasiukajtis GZ2_h = +0.885603194410888749921, 1142*25c28e83SPiotr Jasiukajtis GZ2_l = -4.964236872556339810692e-17, 1143*25c28e83SPiotr Jasiukajtis GZ3_h = +0.936781411463652347038, 1144*25c28e83SPiotr Jasiukajtis GZ3_l = -2.541923110834479415023e-17, 1145*25c28e83SPiotr Jasiukajtis TZ1 = -0.3517214357852935791015625, 1146*25c28e83SPiotr Jasiukajtis TZ3 = +0.280530631542205810546875; 1147*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1148*25c28e83SPiotr Jasiukajtis 1149*25c28e83SPiotr Jasiukajtis /* compute gamma(y=yh+yl) for y in GT1 = [1.0000, 1.2845] */ 1150*25c28e83SPiotr Jasiukajtis /* assume yh got 20 significant bits */ 1151*25c28e83SPiotr Jasiukajtis static struct Double 1152*25c28e83SPiotr Jasiukajtis GT1(double yh, double yl) { 1153*25c28e83SPiotr Jasiukajtis double t3, t4, y, z; 1154*25c28e83SPiotr Jasiukajtis struct Double r; 1155*25c28e83SPiotr Jasiukajtis 1156*25c28e83SPiotr Jasiukajtis y = yh + yl; 1157*25c28e83SPiotr Jasiukajtis z = y * y; 1158*25c28e83SPiotr Jasiukajtis t3 = (z * (P10 + y * ((P11 + y * P12) + z * (P13 + y * P14)))) / 1159*25c28e83SPiotr Jasiukajtis (Q10 + y * ((Q11 + y * Q12) + z * ((Q13 + Q14 * y) + z * Q15))); 1160*25c28e83SPiotr Jasiukajtis t3 += (TZ1 * yl + GZ1_l); 1161*25c28e83SPiotr Jasiukajtis t4 = TZ1 * yh; 1162*25c28e83SPiotr Jasiukajtis r.h = (double) ((float) (t4 + GZ1_h + t3)); 1163*25c28e83SPiotr Jasiukajtis t3 += (t4 - (r.h - GZ1_h)); 1164*25c28e83SPiotr Jasiukajtis r.l = t3; 1165*25c28e83SPiotr Jasiukajtis return (r); 1166*25c28e83SPiotr Jasiukajtis } 1167*25c28e83SPiotr Jasiukajtis 1168*25c28e83SPiotr Jasiukajtis /* compute gamma(y=yh+yl) for y in GT2 = [1.2844, 1.6374] */ 1169*25c28e83SPiotr Jasiukajtis /* assume yh got 20 significant bits */ 1170*25c28e83SPiotr Jasiukajtis static struct Double 1171*25c28e83SPiotr Jasiukajtis GT2(double yh, double yl) { 1172*25c28e83SPiotr Jasiukajtis double t3, y, z; 1173*25c28e83SPiotr Jasiukajtis struct Double r; 1174*25c28e83SPiotr Jasiukajtis 1175*25c28e83SPiotr Jasiukajtis y = yh + yl; 1176*25c28e83SPiotr Jasiukajtis z = y * y; 1177*25c28e83SPiotr Jasiukajtis t3 = (z * (P20 + y * P21 + z * (P22 + y * P23))) / 1178*25c28e83SPiotr Jasiukajtis (Q20 + (y * ((Q21 + Q22 * y) + z * Q23) + 1179*25c28e83SPiotr Jasiukajtis (z * z) * ((Q24 + Q25 * y) + z * Q26))) + GZ2_l; 1180*25c28e83SPiotr Jasiukajtis r.h = (double) ((float) (GZ2_h + t3)); 1181*25c28e83SPiotr Jasiukajtis r.l = t3 - (r.h - GZ2_h); 1182*25c28e83SPiotr Jasiukajtis return (r); 1183*25c28e83SPiotr Jasiukajtis } 1184*25c28e83SPiotr Jasiukajtis 1185*25c28e83SPiotr Jasiukajtis /* compute gamma(y=yh+yl) for y in GT3 = [1.6373, 2.0000] */ 1186*25c28e83SPiotr Jasiukajtis /* assume yh got 20 significant bits */ 1187*25c28e83SPiotr Jasiukajtis static struct Double 1188*25c28e83SPiotr Jasiukajtis GT3(double yh, double yl) { 1189*25c28e83SPiotr Jasiukajtis double t3, t4, y, z; 1190*25c28e83SPiotr Jasiukajtis struct Double r; 1191*25c28e83SPiotr Jasiukajtis 1192*25c28e83SPiotr Jasiukajtis y = yh + yl; 1193*25c28e83SPiotr Jasiukajtis z = y * y; 1194*25c28e83SPiotr Jasiukajtis t3 = (z * (P30 + y * ((P31 + y * P32) + z * (P33 + y * P34)))) / 1195*25c28e83SPiotr Jasiukajtis (Q30 + y * ((Q31 + y * Q32) + z * ((Q33 + Q34 * y) + z * Q35))); 1196*25c28e83SPiotr Jasiukajtis t3 += (TZ3 * yl + GZ3_l); 1197*25c28e83SPiotr Jasiukajtis t4 = TZ3 * yh; 1198*25c28e83SPiotr Jasiukajtis r.h = (double) ((float) (t4 + GZ3_h + t3)); 1199*25c28e83SPiotr Jasiukajtis t3 += (t4 - (r.h - GZ3_h)); 1200*25c28e83SPiotr Jasiukajtis r.l = t3; 1201*25c28e83SPiotr Jasiukajtis return (r); 1202*25c28e83SPiotr Jasiukajtis } 1203*25c28e83SPiotr Jasiukajtis 1204*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1205*25c28e83SPiotr Jasiukajtis /* 1206*25c28e83SPiotr Jasiukajtis * return tgamma(x) scaled by 2**-m for 8<x<=171.62... using Stirling's formula 1207*25c28e83SPiotr Jasiukajtis * log(G(x)) ~= (x-.5)*(log(x)-1) + .5(log(2*pi)-1) + (1/x)*P(1/(x*x)) 1208*25c28e83SPiotr Jasiukajtis * = L1 + L2 + L3, 1209*25c28e83SPiotr Jasiukajtis */ 1210*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1211*25c28e83SPiotr Jasiukajtis static struct Double 1212*25c28e83SPiotr Jasiukajtis large_gam(double x, int *m) { 1213*25c28e83SPiotr Jasiukajtis double z, t1, t2, t3, z2, t5, w, y, u, r, z4, v, t24 = 16777216.0, 1214*25c28e83SPiotr Jasiukajtis p24 = 1.0 / 16777216.0; 1215*25c28e83SPiotr Jasiukajtis int n2, j2, k, ix, j; 1216*25c28e83SPiotr Jasiukajtis unsigned lx; 1217*25c28e83SPiotr Jasiukajtis struct Double zz; 1218*25c28e83SPiotr Jasiukajtis double u2, ss_h, ss_l, r_h, w_h, w_l, t4; 1219*25c28e83SPiotr Jasiukajtis 1220*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1221*25c28e83SPiotr Jasiukajtis /* 1222*25c28e83SPiotr Jasiukajtis * compute ss = ss.h+ss.l = log(x)-1 (see tgamma_log.h for details) 1223*25c28e83SPiotr Jasiukajtis * 1224*25c28e83SPiotr Jasiukajtis * log(x) - 1 = T1(n) + T2(j) + T3(s), where x = 2**n * y, 1<=y<2, 1225*25c28e83SPiotr Jasiukajtis * j=[64*y], z[j]=1+j/64+1/128, s = (y-z[j])/(y+z[j]), and 1226*25c28e83SPiotr Jasiukajtis * T1(n) = T1[2n,2n+1] = n*log(2)-1, 1227*25c28e83SPiotr Jasiukajtis * T2(j) = T2[2j,2j+1] = log(z[j]), 1228*25c28e83SPiotr Jasiukajtis * T3(s) = 2s + A1[0]s^3 + A2[1]s^5 + A3[2]s^7 1229*25c28e83SPiotr Jasiukajtis * Note 1230*25c28e83SPiotr Jasiukajtis * (1) the leading entries are truncated to 24 binary point. 1231*25c28e83SPiotr Jasiukajtis * (2) Remez error for T3(s) is bounded by 2**(-72.4) 1232*25c28e83SPiotr Jasiukajtis * 2**(-24) 1233*25c28e83SPiotr Jasiukajtis * _________V___________________ 1234*25c28e83SPiotr Jasiukajtis * T1(n): |_________|___________________| 1235*25c28e83SPiotr Jasiukajtis * _______ ______________________ 1236*25c28e83SPiotr Jasiukajtis * T2(j): |_______|______________________| 1237*25c28e83SPiotr Jasiukajtis * ____ _______________________ 1238*25c28e83SPiotr Jasiukajtis * 2s: |____|_______________________| 1239*25c28e83SPiotr Jasiukajtis * __________________________ 1240*25c28e83SPiotr Jasiukajtis * + T3(s)-2s: |__________________________| 1241*25c28e83SPiotr Jasiukajtis * ------------------------------------------- 1242*25c28e83SPiotr Jasiukajtis * [leading] + [Trailing] 1243*25c28e83SPiotr Jasiukajtis */ 1244*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1245*25c28e83SPiotr Jasiukajtis ix = __HI(x); 1246*25c28e83SPiotr Jasiukajtis lx = __LO(x); 1247*25c28e83SPiotr Jasiukajtis n2 = (ix >> 20) - 0x3ff; /* exponent of x, range:3-7 */ 1248*25c28e83SPiotr Jasiukajtis n2 += n2; /* 2n */ 1249*25c28e83SPiotr Jasiukajtis ix = (ix & 0x000fffff) | 0x3ff00000; /* y = scale x to [1,2] */ 1250*25c28e83SPiotr Jasiukajtis __HI(y) = ix; 1251*25c28e83SPiotr Jasiukajtis __LO(y) = lx; 1252*25c28e83SPiotr Jasiukajtis __HI(z) = (ix & 0xffffc000) | 0x2000; /* z[j]=1+j/64+1/128 */ 1253*25c28e83SPiotr Jasiukajtis __LO(z) = 0; 1254*25c28e83SPiotr Jasiukajtis j2 = (ix >> 13) & 0x7e; /* 2j */ 1255*25c28e83SPiotr Jasiukajtis t1 = y + z; 1256*25c28e83SPiotr Jasiukajtis t2 = y - z; 1257*25c28e83SPiotr Jasiukajtis r = one / t1; 1258*25c28e83SPiotr Jasiukajtis t1 = (double) ((float) t1); 1259*25c28e83SPiotr Jasiukajtis u = r * t2; /* u = (y-z)/(y+z) */ 1260*25c28e83SPiotr Jasiukajtis t4 = T2[j2 + 1] + T1[n2 + 1]; 1261*25c28e83SPiotr Jasiukajtis z2 = u * u; 1262*25c28e83SPiotr Jasiukajtis k = __HI(u) & 0x7fffffff; 1263*25c28e83SPiotr Jasiukajtis t3 = T2[j2] + T1[n2]; 1264*25c28e83SPiotr Jasiukajtis if ((k >> 20) < 0x3ec) { /* |u|<2**-19 */ 1265*25c28e83SPiotr Jasiukajtis t2 = t4 + u * ((two + z2 * A1) + (z2 * z2) * (A2 + z2 * A3)); 1266*25c28e83SPiotr Jasiukajtis } else { 1267*25c28e83SPiotr Jasiukajtis t5 = t4 + u * (z2 * A1 + (z2 * z2) * (A2 + z2 * A3)); 1268*25c28e83SPiotr Jasiukajtis u2 = u + u; 1269*25c28e83SPiotr Jasiukajtis v = (double) ((int) (u2 * t24)) * p24; 1270*25c28e83SPiotr Jasiukajtis t2 = t5 + r * ((two * t2 - v * t1) - v * (y - (t1 - z))); 1271*25c28e83SPiotr Jasiukajtis t3 += v; 1272*25c28e83SPiotr Jasiukajtis } 1273*25c28e83SPiotr Jasiukajtis ss_h = (double) ((float) (t2 + t3)); 1274*25c28e83SPiotr Jasiukajtis ss_l = t2 - (ss_h - t3); 1275*25c28e83SPiotr Jasiukajtis 1276*25c28e83SPiotr Jasiukajtis /* 1277*25c28e83SPiotr Jasiukajtis * compute ww = (x-.5)*(log(x)-1) + .5*(log(2pi)-1) + 1/x*(P(1/x^2))) 1278*25c28e83SPiotr Jasiukajtis * where ss = log(x) - 1 in already in extra precision 1279*25c28e83SPiotr Jasiukajtis */ 1280*25c28e83SPiotr Jasiukajtis z = one / x; 1281*25c28e83SPiotr Jasiukajtis r = x - half; 1282*25c28e83SPiotr Jasiukajtis r_h = (double) ((float) r); 1283*25c28e83SPiotr Jasiukajtis w_h = r_h * ss_h + hln2pi_h; 1284*25c28e83SPiotr Jasiukajtis z2 = z * z; 1285*25c28e83SPiotr Jasiukajtis w = (r - r_h) * ss_h + r * ss_l; 1286*25c28e83SPiotr Jasiukajtis z4 = z2 * z2; 1287*25c28e83SPiotr Jasiukajtis t1 = z2 * (GP1 + z4 * (GP3 + z4 * (GP5 + z4 * GP7))); 1288*25c28e83SPiotr Jasiukajtis t2 = z4 * (GP2 + z4 * (GP4 + z4 * GP6)); 1289*25c28e83SPiotr Jasiukajtis t1 += t2; 1290*25c28e83SPiotr Jasiukajtis w += hln2pi_l; 1291*25c28e83SPiotr Jasiukajtis w_l = z * (GP0 + t1) + w; 1292*25c28e83SPiotr Jasiukajtis k = (int) ((w_h + w_l) * invln2_32 + half); 1293*25c28e83SPiotr Jasiukajtis 1294*25c28e83SPiotr Jasiukajtis /* compute the exponential of w_h+w_l */ 1295*25c28e83SPiotr Jasiukajtis j = k & 0x1f; 1296*25c28e83SPiotr Jasiukajtis *m = (k >> 5); 1297*25c28e83SPiotr Jasiukajtis t3 = (double) k; 1298*25c28e83SPiotr Jasiukajtis 1299*25c28e83SPiotr Jasiukajtis /* perform w - k*ln2_32 (represent as w_h - w_l) */ 1300*25c28e83SPiotr Jasiukajtis t1 = w_h - t3 * ln2_32hi; 1301*25c28e83SPiotr Jasiukajtis t2 = t3 * ln2_32lo; 1302*25c28e83SPiotr Jasiukajtis w = w_l - t2; 1303*25c28e83SPiotr Jasiukajtis w_h = t1 + w_l; 1304*25c28e83SPiotr Jasiukajtis w_l = t2 - (w_l - (w_h - t1)); 1305*25c28e83SPiotr Jasiukajtis 1306*25c28e83SPiotr Jasiukajtis /* compute exp(w_h+w_l) */ 1307*25c28e83SPiotr Jasiukajtis z = w_h - w_l; 1308*25c28e83SPiotr Jasiukajtis z2 = z * z; 1309*25c28e83SPiotr Jasiukajtis t1 = z2 * (Et1 + z2 * (Et3 + z2 * Et5)); 1310*25c28e83SPiotr Jasiukajtis t2 = z2 * (Et2 + z2 * Et4); 1311*25c28e83SPiotr Jasiukajtis t3 = w_h - (w_l - (t1 + z * t2)); 1312*25c28e83SPiotr Jasiukajtis zz.l = S_trail[j] * (one + t3) + S[j] * t3; 1313*25c28e83SPiotr Jasiukajtis zz.h = S[j]; 1314*25c28e83SPiotr Jasiukajtis return (zz); 1315*25c28e83SPiotr Jasiukajtis } 1316*25c28e83SPiotr Jasiukajtis 1317*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1318*25c28e83SPiotr Jasiukajtis /* 1319*25c28e83SPiotr Jasiukajtis * kpsin(x)= sin(pi*x)/pi 1320*25c28e83SPiotr Jasiukajtis * 3 5 7 9 11 13 15 1321*25c28e83SPiotr Jasiukajtis * = x+ks[0]*x +ks[1]*x +ks[2]*x +ks[3]*x +ks[4]*x +ks[5]*x +ks[6]*x 1322*25c28e83SPiotr Jasiukajtis */ 1323*25c28e83SPiotr Jasiukajtis static const double ks[] = { 1324*25c28e83SPiotr Jasiukajtis -1.64493406684822640606569, 1325*25c28e83SPiotr Jasiukajtis +8.11742425283341655883668741874008920850698590621e-0001, 1326*25c28e83SPiotr Jasiukajtis -1.90751824120862873825597279118304943994042258291e-0001, 1327*25c28e83SPiotr Jasiukajtis +2.61478477632554278317289628332654539353521911570e-0002, 1328*25c28e83SPiotr Jasiukajtis -2.34607978510202710377617190278735525354347705866e-0003, 1329*25c28e83SPiotr Jasiukajtis +1.48413292290051695897242899977121846763824221705e-0004, 1330*25c28e83SPiotr Jasiukajtis -6.87730769637543488108688726777687262485357072242e-0006, 1331*25c28e83SPiotr Jasiukajtis }; 1332*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1333*25c28e83SPiotr Jasiukajtis 1334*25c28e83SPiotr Jasiukajtis /* assume x is not tiny and positive */ 1335*25c28e83SPiotr Jasiukajtis static struct Double 1336*25c28e83SPiotr Jasiukajtis kpsin(double x) { 1337*25c28e83SPiotr Jasiukajtis double z, t1, t2, t3, t4; 1338*25c28e83SPiotr Jasiukajtis struct Double xx; 1339*25c28e83SPiotr Jasiukajtis 1340*25c28e83SPiotr Jasiukajtis z = x * x; 1341*25c28e83SPiotr Jasiukajtis xx.h = x; 1342*25c28e83SPiotr Jasiukajtis t1 = z * x; 1343*25c28e83SPiotr Jasiukajtis t2 = z * z; 1344*25c28e83SPiotr Jasiukajtis t4 = t1 * ks[0]; 1345*25c28e83SPiotr Jasiukajtis t3 = (t1 * z) * ((ks[1] + z * ks[2] + t2 * ks[3]) + (z * t2) * 1346*25c28e83SPiotr Jasiukajtis (ks[4] + z * ks[5] + t2 * ks[6])); 1347*25c28e83SPiotr Jasiukajtis xx.l = t4 + t3; 1348*25c28e83SPiotr Jasiukajtis return (xx); 1349*25c28e83SPiotr Jasiukajtis } 1350*25c28e83SPiotr Jasiukajtis 1351*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1352*25c28e83SPiotr Jasiukajtis /* 1353*25c28e83SPiotr Jasiukajtis * kpcos(x)= cos(pi*x)/pi 1354*25c28e83SPiotr Jasiukajtis * 2 4 6 8 10 12 1355*25c28e83SPiotr Jasiukajtis * = 1/pi +kc[0]*x +kc[1]*x +kc[2]*x +kc[3]*x +kc[4]*x +kc[5]*x 1356*25c28e83SPiotr Jasiukajtis */ 1357*25c28e83SPiotr Jasiukajtis 1358*25c28e83SPiotr Jasiukajtis static const double one_pi_h = 0.318309886183790635705292970, 1359*25c28e83SPiotr Jasiukajtis one_pi_l = 3.583247455607534006714276420e-17; 1360*25c28e83SPiotr Jasiukajtis static const double npi_2_h = -1.5625, 1361*25c28e83SPiotr Jasiukajtis npi_2_l = -0.00829632679489661923132169163975055099555883223; 1362*25c28e83SPiotr Jasiukajtis static const double kc[] = { 1363*25c28e83SPiotr Jasiukajtis -1.57079632679489661923132169163975055099555883223e+0000, 1364*25c28e83SPiotr Jasiukajtis +1.29192819501230224953283586722575766189551966008e+0000, 1365*25c28e83SPiotr Jasiukajtis -4.25027339940149518500158850753393173519732149213e-0001, 1366*25c28e83SPiotr Jasiukajtis +7.49080625187015312373925142219429422375556727752e-0002, 1367*25c28e83SPiotr Jasiukajtis -8.21442040906099210866977352284054849051348692715e-0003, 1368*25c28e83SPiotr Jasiukajtis +6.10411356829515414575566564733632532333904115968e-0004, 1369*25c28e83SPiotr Jasiukajtis }; 1370*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1371*25c28e83SPiotr Jasiukajtis 1372*25c28e83SPiotr Jasiukajtis /* assume x is not tiny and positive */ 1373*25c28e83SPiotr Jasiukajtis static struct Double 1374*25c28e83SPiotr Jasiukajtis kpcos(double x) { 1375*25c28e83SPiotr Jasiukajtis double z, t1, t2, t3, t4, x4, x8; 1376*25c28e83SPiotr Jasiukajtis struct Double xx; 1377*25c28e83SPiotr Jasiukajtis 1378*25c28e83SPiotr Jasiukajtis z = x * x; 1379*25c28e83SPiotr Jasiukajtis xx.h = one_pi_h; 1380*25c28e83SPiotr Jasiukajtis t1 = (double) ((float) x); 1381*25c28e83SPiotr Jasiukajtis x4 = z * z; 1382*25c28e83SPiotr Jasiukajtis t2 = npi_2_l * z + npi_2_h * (x + t1) * (x - t1); 1383*25c28e83SPiotr Jasiukajtis t3 = one_pi_l + x4 * ((kc[1] + z * kc[2]) + x4 * (kc[3] + z * 1384*25c28e83SPiotr Jasiukajtis kc[4] + x4 * kc[5])); 1385*25c28e83SPiotr Jasiukajtis t4 = t1 * t1; /* 48 bits mantissa */ 1386*25c28e83SPiotr Jasiukajtis x8 = t2 + t3; 1387*25c28e83SPiotr Jasiukajtis t4 *= npi_2_h; /* npi_2_h is 5 bits const. The product is exact */ 1388*25c28e83SPiotr Jasiukajtis xx.l = x8 + t4; /* that will minimized the rounding error in xx.l */ 1389*25c28e83SPiotr Jasiukajtis return (xx); 1390*25c28e83SPiotr Jasiukajtis } 1391*25c28e83SPiotr Jasiukajtis 1392*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1393*25c28e83SPiotr Jasiukajtis static const double 1394*25c28e83SPiotr Jasiukajtis /* 0.134861805732790769689793935774652917006 */ 1395*25c28e83SPiotr Jasiukajtis t0z1 = 0.1348618057327907737708, 1396*25c28e83SPiotr Jasiukajtis t0z1_l = -4.0810077708578299022531e-18, 1397*25c28e83SPiotr Jasiukajtis /* 0.461632144968362341262659542325721328468 */ 1398*25c28e83SPiotr Jasiukajtis t0z2 = 0.4616321449683623567850, 1399*25c28e83SPiotr Jasiukajtis t0z2_l = -1.5522348162858676890521e-17, 1400*25c28e83SPiotr Jasiukajtis /* 0.819773101100500601787868704921606996312 */ 1401*25c28e83SPiotr Jasiukajtis t0z3 = 0.8197731011005006118708, 1402*25c28e83SPiotr Jasiukajtis t0z3_l = -1.0082945122487103498325e-17; 1403*25c28e83SPiotr Jasiukajtis /* 1.134861805732790769689793935774652917006 */ 1404*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1405*25c28e83SPiotr Jasiukajtis 1406*25c28e83SPiotr Jasiukajtis /* gamma(x+i) for 0 <= x < 1 */ 1407*25c28e83SPiotr Jasiukajtis static struct Double 1408*25c28e83SPiotr Jasiukajtis gam_n(int i, double x) { 1409*25c28e83SPiotr Jasiukajtis struct Double rr = {0.0L, 0.0L}, yy; 1410*25c28e83SPiotr Jasiukajtis double r1, r2, t2, z, xh, xl, yh, yl, zh, z1, z2, zl, x5, wh, wl; 1411*25c28e83SPiotr Jasiukajtis 1412*25c28e83SPiotr Jasiukajtis /* compute yy = gamma(x+1) */ 1413*25c28e83SPiotr Jasiukajtis if (x > 0.2845) { 1414*25c28e83SPiotr Jasiukajtis if (x > 0.6374) { 1415*25c28e83SPiotr Jasiukajtis r1 = x - t0z3; 1416*25c28e83SPiotr Jasiukajtis r2 = (double) ((float) (r1 - t0z3_l)); 1417*25c28e83SPiotr Jasiukajtis t2 = r1 - r2; 1418*25c28e83SPiotr Jasiukajtis yy = GT3(r2, t2 - t0z3_l); 1419*25c28e83SPiotr Jasiukajtis } else { 1420*25c28e83SPiotr Jasiukajtis r1 = x - t0z2; 1421*25c28e83SPiotr Jasiukajtis r2 = (double) ((float) (r1 - t0z2_l)); 1422*25c28e83SPiotr Jasiukajtis t2 = r1 - r2; 1423*25c28e83SPiotr Jasiukajtis yy = GT2(r2, t2 - t0z2_l); 1424*25c28e83SPiotr Jasiukajtis } 1425*25c28e83SPiotr Jasiukajtis } else { 1426*25c28e83SPiotr Jasiukajtis r1 = x - t0z1; 1427*25c28e83SPiotr Jasiukajtis r2 = (double) ((float) (r1 - t0z1_l)); 1428*25c28e83SPiotr Jasiukajtis t2 = r1 - r2; 1429*25c28e83SPiotr Jasiukajtis yy = GT1(r2, t2 - t0z1_l); 1430*25c28e83SPiotr Jasiukajtis } 1431*25c28e83SPiotr Jasiukajtis 1432*25c28e83SPiotr Jasiukajtis /* compute gamma(x+i) = (x+i-1)*...*(x+1)*yy, 0<i<8 */ 1433*25c28e83SPiotr Jasiukajtis switch (i) { 1434*25c28e83SPiotr Jasiukajtis case 0: /* yy/x */ 1435*25c28e83SPiotr Jasiukajtis r1 = one / x; 1436*25c28e83SPiotr Jasiukajtis xh = (double) ((float) x); /* x is not tiny */ 1437*25c28e83SPiotr Jasiukajtis rr.h = (double) ((float) ((yy.h + yy.l) * r1)); 1438*25c28e83SPiotr Jasiukajtis rr.l = r1 * (yy.h - rr.h * xh) - 1439*25c28e83SPiotr Jasiukajtis ((r1 * rr.h) * (x - xh) - r1 * yy.l); 1440*25c28e83SPiotr Jasiukajtis break; 1441*25c28e83SPiotr Jasiukajtis case 1: /* yy */ 1442*25c28e83SPiotr Jasiukajtis rr.h = yy.h; 1443*25c28e83SPiotr Jasiukajtis rr.l = yy.l; 1444*25c28e83SPiotr Jasiukajtis break; 1445*25c28e83SPiotr Jasiukajtis case 2: /* (x+1)*yy */ 1446*25c28e83SPiotr Jasiukajtis z = x + one; /* may not be exact */ 1447*25c28e83SPiotr Jasiukajtis zh = (double) ((float) z); 1448*25c28e83SPiotr Jasiukajtis rr.h = zh * yy.h; 1449*25c28e83SPiotr Jasiukajtis rr.l = z * yy.l + (x - (zh - one)) * yy.h; 1450*25c28e83SPiotr Jasiukajtis break; 1451*25c28e83SPiotr Jasiukajtis case 3: /* (x+2)*(x+1)*yy */ 1452*25c28e83SPiotr Jasiukajtis z1 = x + one; 1453*25c28e83SPiotr Jasiukajtis z2 = x + 2.0; 1454*25c28e83SPiotr Jasiukajtis z = z1 * z2; 1455*25c28e83SPiotr Jasiukajtis xh = (double) ((float) z); 1456*25c28e83SPiotr Jasiukajtis zh = (double) ((float) z1); 1457*25c28e83SPiotr Jasiukajtis xl = (x - (zh - one)) * (z2 + zh) - (xh - zh * (zh + one)); 1458*25c28e83SPiotr Jasiukajtis rr.h = xh * yy.h; 1459*25c28e83SPiotr Jasiukajtis rr.l = z * yy.l + xl * yy.h; 1460*25c28e83SPiotr Jasiukajtis break; 1461*25c28e83SPiotr Jasiukajtis 1462*25c28e83SPiotr Jasiukajtis case 4: /* (x+1)*(x+3)*(x+2)*yy */ 1463*25c28e83SPiotr Jasiukajtis z1 = x + 2.0; 1464*25c28e83SPiotr Jasiukajtis z2 = (x + one) * (x + 3.0); 1465*25c28e83SPiotr Jasiukajtis zh = z1; 1466*25c28e83SPiotr Jasiukajtis __LO(zh) = 0; 1467*25c28e83SPiotr Jasiukajtis __HI(zh) &= 0xfffffff8; /* zh 18 bits mantissa */ 1468*25c28e83SPiotr Jasiukajtis zl = x - (zh - 2.0); 1469*25c28e83SPiotr Jasiukajtis z = z1 * z2; 1470*25c28e83SPiotr Jasiukajtis xh = (double) ((float) z); 1471*25c28e83SPiotr Jasiukajtis xl = zl * (z2 + zh * (z1 + zh)) - (xh - zh * (zh * zh - one)); 1472*25c28e83SPiotr Jasiukajtis rr.h = xh * yy.h; 1473*25c28e83SPiotr Jasiukajtis rr.l = z * yy.l + xl * yy.h; 1474*25c28e83SPiotr Jasiukajtis break; 1475*25c28e83SPiotr Jasiukajtis case 5: /* ((x+1)*(x+4)*(x+2)*(x+3))*yy */ 1476*25c28e83SPiotr Jasiukajtis z1 = x + 2.0; 1477*25c28e83SPiotr Jasiukajtis z2 = x + 3.0; 1478*25c28e83SPiotr Jasiukajtis z = z1 * z2; 1479*25c28e83SPiotr Jasiukajtis zh = (double) ((float) z1); 1480*25c28e83SPiotr Jasiukajtis yh = (double) ((float) z); 1481*25c28e83SPiotr Jasiukajtis yl = (x - (zh - 2.0)) * (z2 + zh) - (yh - zh * (zh + one)); 1482*25c28e83SPiotr Jasiukajtis z2 = z - 2.0; 1483*25c28e83SPiotr Jasiukajtis z *= z2; 1484*25c28e83SPiotr Jasiukajtis xh = (double) ((float) z); 1485*25c28e83SPiotr Jasiukajtis xl = yl * (z2 + yh) - (xh - yh * (yh - 2.0)); 1486*25c28e83SPiotr Jasiukajtis rr.h = xh * yy.h; 1487*25c28e83SPiotr Jasiukajtis rr.l = z * yy.l + xl * yy.h; 1488*25c28e83SPiotr Jasiukajtis break; 1489*25c28e83SPiotr Jasiukajtis case 6: /* ((x+1)*(x+2)*(x+3)*(x+4)*(x+5))*yy */ 1490*25c28e83SPiotr Jasiukajtis z1 = x + 2.0; 1491*25c28e83SPiotr Jasiukajtis z2 = x + 3.0; 1492*25c28e83SPiotr Jasiukajtis z = z1 * z2; 1493*25c28e83SPiotr Jasiukajtis zh = (double) ((float) z1); 1494*25c28e83SPiotr Jasiukajtis yh = (double) ((float) z); 1495*25c28e83SPiotr Jasiukajtis z1 = x - (zh - 2.0); 1496*25c28e83SPiotr Jasiukajtis yl = z1 * (z2 + zh) - (yh - zh * (zh + one)); 1497*25c28e83SPiotr Jasiukajtis z2 = z - 2.0; 1498*25c28e83SPiotr Jasiukajtis x5 = x + 5.0; 1499*25c28e83SPiotr Jasiukajtis z *= z2; 1500*25c28e83SPiotr Jasiukajtis xh = (double) ((float) z); 1501*25c28e83SPiotr Jasiukajtis zh += 3.0; 1502*25c28e83SPiotr Jasiukajtis xl = yl * (z2 + yh) - (xh - yh * (yh - 2.0)); 1503*25c28e83SPiotr Jasiukajtis /* xh+xl=(x+1)*...*(x+4) */ 1504*25c28e83SPiotr Jasiukajtis /* wh+wl=(x+5)*yy */ 1505*25c28e83SPiotr Jasiukajtis wh = (double) ((float) (x5 * (yy.h + yy.l))); 1506*25c28e83SPiotr Jasiukajtis wl = (z1 * yy.h + x5 * yy.l) - (wh - zh * yy.h); 1507*25c28e83SPiotr Jasiukajtis rr.h = wh * xh; 1508*25c28e83SPiotr Jasiukajtis rr.l = z * wl + xl * wh; 1509*25c28e83SPiotr Jasiukajtis break; 1510*25c28e83SPiotr Jasiukajtis case 7: /* ((x+1)*(x+2)*(x+3)*(x+4)*(x+5)*(x+6))*yy */ 1511*25c28e83SPiotr Jasiukajtis z1 = x + 3.0; 1512*25c28e83SPiotr Jasiukajtis z2 = x + 4.0; 1513*25c28e83SPiotr Jasiukajtis z = z2 * z1; 1514*25c28e83SPiotr Jasiukajtis zh = (double) ((float) z1); 1515*25c28e83SPiotr Jasiukajtis yh = (double) ((float) z); /* yh+yl = (x+3)(x+4) */ 1516*25c28e83SPiotr Jasiukajtis yl = (x - (zh - 3.0)) * (z2 + zh) - (yh - (zh * (zh + one))); 1517*25c28e83SPiotr Jasiukajtis z1 = x + 6.0; 1518*25c28e83SPiotr Jasiukajtis z2 = z - 2.0; /* z2 = (x+2)*(x+5) */ 1519*25c28e83SPiotr Jasiukajtis z *= z2; 1520*25c28e83SPiotr Jasiukajtis xh = (double) ((float) z); 1521*25c28e83SPiotr Jasiukajtis xl = yl * (z2 + yh) - (xh - yh * (yh - 2.0)); 1522*25c28e83SPiotr Jasiukajtis /* xh+xl=(x+2)*...*(x+5) */ 1523*25c28e83SPiotr Jasiukajtis /* wh+wl=(x+1)(x+6)*yy */ 1524*25c28e83SPiotr Jasiukajtis z2 -= 4.0; /* z2 = (x+1)(x+6) */ 1525*25c28e83SPiotr Jasiukajtis wh = (double) ((float) (z2 * (yy.h + yy.l))); 1526*25c28e83SPiotr Jasiukajtis wl = (z2 * yy.l + yl * yy.h) - (wh - (yh - 6.0) * yy.h); 1527*25c28e83SPiotr Jasiukajtis rr.h = wh * xh; 1528*25c28e83SPiotr Jasiukajtis rr.l = z * wl + xl * wh; 1529*25c28e83SPiotr Jasiukajtis } 1530*25c28e83SPiotr Jasiukajtis return (rr); 1531*25c28e83SPiotr Jasiukajtis } 1532*25c28e83SPiotr Jasiukajtis 1533*25c28e83SPiotr Jasiukajtis double 1534*25c28e83SPiotr Jasiukajtis tgamma(double x) { 1535*25c28e83SPiotr Jasiukajtis struct Double ss, ww; 1536*25c28e83SPiotr Jasiukajtis double t, t1, t2, t3, t4, t5, w, y, z, z1, z2, z3, z5; 1537*25c28e83SPiotr Jasiukajtis int i, j, k, m, ix, hx, xk; 1538*25c28e83SPiotr Jasiukajtis unsigned lx; 1539*25c28e83SPiotr Jasiukajtis 1540*25c28e83SPiotr Jasiukajtis hx = __HI(x); 1541*25c28e83SPiotr Jasiukajtis lx = __LO(x); 1542*25c28e83SPiotr Jasiukajtis ix = hx & 0x7fffffff; 1543*25c28e83SPiotr Jasiukajtis y = x; 1544*25c28e83SPiotr Jasiukajtis 1545*25c28e83SPiotr Jasiukajtis if (ix < 0x3ca00000) 1546*25c28e83SPiotr Jasiukajtis return (one / x); /* |x| < 2**-53 */ 1547*25c28e83SPiotr Jasiukajtis if (ix >= 0x7ff00000) 1548*25c28e83SPiotr Jasiukajtis /* +Inf -> +Inf, -Inf or NaN -> NaN */ 1549*25c28e83SPiotr Jasiukajtis return (x * ((hx < 0)? 0.0 : x)); 1550*25c28e83SPiotr Jasiukajtis if (hx > 0x406573fa || /* x > 171.62... overflow to +inf */ 1551*25c28e83SPiotr Jasiukajtis (hx == 0x406573fa && lx > 0xE561F647)) { 1552*25c28e83SPiotr Jasiukajtis z = x / tiny; 1553*25c28e83SPiotr Jasiukajtis return (z * z); 1554*25c28e83SPiotr Jasiukajtis } 1555*25c28e83SPiotr Jasiukajtis if (hx >= 0x40200000) { /* x >= 8 */ 1556*25c28e83SPiotr Jasiukajtis ww = large_gam(x, &m); 1557*25c28e83SPiotr Jasiukajtis w = ww.h + ww.l; 1558*25c28e83SPiotr Jasiukajtis __HI(w) += m << 20; 1559*25c28e83SPiotr Jasiukajtis return (w); 1560*25c28e83SPiotr Jasiukajtis } 1561*25c28e83SPiotr Jasiukajtis if (hx > 0) { /* 0 < x < 8 */ 1562*25c28e83SPiotr Jasiukajtis i = (int) x; 1563*25c28e83SPiotr Jasiukajtis ww = gam_n(i, x - (double) i); 1564*25c28e83SPiotr Jasiukajtis return (ww.h + ww.l); 1565*25c28e83SPiotr Jasiukajtis } 1566*25c28e83SPiotr Jasiukajtis 1567*25c28e83SPiotr Jasiukajtis /* negative x */ 1568*25c28e83SPiotr Jasiukajtis /* INDENT OFF */ 1569*25c28e83SPiotr Jasiukajtis /* 1570*25c28e83SPiotr Jasiukajtis * compute: xk = 1571*25c28e83SPiotr Jasiukajtis * -2 ... x is an even int (-inf is even) 1572*25c28e83SPiotr Jasiukajtis * -1 ... x is an odd int 1573*25c28e83SPiotr Jasiukajtis * +0 ... x is not an int but chopped to an even int 1574*25c28e83SPiotr Jasiukajtis * +1 ... x is not an int but chopped to an odd int 1575*25c28e83SPiotr Jasiukajtis */ 1576*25c28e83SPiotr Jasiukajtis /* INDENT ON */ 1577*25c28e83SPiotr Jasiukajtis xk = 0; 1578*25c28e83SPiotr Jasiukajtis if (ix >= 0x43300000) { 1579*25c28e83SPiotr Jasiukajtis if (ix >= 0x43400000) 1580*25c28e83SPiotr Jasiukajtis xk = -2; 1581*25c28e83SPiotr Jasiukajtis else 1582*25c28e83SPiotr Jasiukajtis xk = -2 + (lx & 1); 1583*25c28e83SPiotr Jasiukajtis } else if (ix >= 0x3ff00000) { 1584*25c28e83SPiotr Jasiukajtis k = (ix >> 20) - 0x3ff; 1585*25c28e83SPiotr Jasiukajtis if (k > 20) { 1586*25c28e83SPiotr Jasiukajtis j = lx >> (52 - k); 1587*25c28e83SPiotr Jasiukajtis if ((j << (52 - k)) == lx) 1588*25c28e83SPiotr Jasiukajtis xk = -2 + (j & 1); 1589*25c28e83SPiotr Jasiukajtis else 1590*25c28e83SPiotr Jasiukajtis xk = j & 1; 1591*25c28e83SPiotr Jasiukajtis } else { 1592*25c28e83SPiotr Jasiukajtis j = ix >> (20 - k); 1593*25c28e83SPiotr Jasiukajtis if ((j << (20 - k)) == ix && lx == 0) 1594*25c28e83SPiotr Jasiukajtis xk = -2 + (j & 1); 1595*25c28e83SPiotr Jasiukajtis else 1596*25c28e83SPiotr Jasiukajtis xk = j & 1; 1597*25c28e83SPiotr Jasiukajtis } 1598*25c28e83SPiotr Jasiukajtis } 1599*25c28e83SPiotr Jasiukajtis if (xk < 0) 1600*25c28e83SPiotr Jasiukajtis /* ideally gamma(-n)= (-1)**(n+1) * inf, but c99 expect NaN */ 1601*25c28e83SPiotr Jasiukajtis return ((x - x) / (x - x)); /* 0/0 = NaN */ 1602*25c28e83SPiotr Jasiukajtis 1603*25c28e83SPiotr Jasiukajtis 1604*25c28e83SPiotr Jasiukajtis /* negative underflow thresold */ 1605*25c28e83SPiotr Jasiukajtis if (ix > 0x4066e000 || (ix == 0x4066e000 && lx > 11)) { 1606*25c28e83SPiotr Jasiukajtis /* x < -183.0 - 11ulp */ 1607*25c28e83SPiotr Jasiukajtis z = tiny / x; 1608*25c28e83SPiotr Jasiukajtis if (xk == 1) 1609*25c28e83SPiotr Jasiukajtis z = -z; 1610*25c28e83SPiotr Jasiukajtis return (z * tiny); 1611*25c28e83SPiotr Jasiukajtis } 1612*25c28e83SPiotr Jasiukajtis 1613*25c28e83SPiotr Jasiukajtis /* now compute gamma(x) by -1/((sin(pi*y)/pi)*gamma(1+y)), y = -x */ 1614*25c28e83SPiotr Jasiukajtis 1615*25c28e83SPiotr Jasiukajtis /* 1616*25c28e83SPiotr Jasiukajtis * First compute ss = -sin(pi*y)/pi , so that 1617*25c28e83SPiotr Jasiukajtis * gamma(x) = 1/(ss*gamma(1+y)) 1618*25c28e83SPiotr Jasiukajtis */ 1619*25c28e83SPiotr Jasiukajtis y = -x; 1620*25c28e83SPiotr Jasiukajtis j = (int) y; 1621*25c28e83SPiotr Jasiukajtis z = y - (double) j; 1622*25c28e83SPiotr Jasiukajtis if (z > 0.3183098861837906715377675) 1623*25c28e83SPiotr Jasiukajtis if (z > 0.6816901138162093284622325) 1624*25c28e83SPiotr Jasiukajtis ss = kpsin(one - z); 1625*25c28e83SPiotr Jasiukajtis else 1626*25c28e83SPiotr Jasiukajtis ss = kpcos(0.5 - z); 1627*25c28e83SPiotr Jasiukajtis else 1628*25c28e83SPiotr Jasiukajtis ss = kpsin(z); 1629*25c28e83SPiotr Jasiukajtis if (xk == 0) { 1630*25c28e83SPiotr Jasiukajtis ss.h = -ss.h; 1631*25c28e83SPiotr Jasiukajtis ss.l = -ss.l; 1632*25c28e83SPiotr Jasiukajtis } 1633*25c28e83SPiotr Jasiukajtis 1634*25c28e83SPiotr Jasiukajtis /* Then compute ww = gamma(1+y), note that result scale to 2**m */ 1635*25c28e83SPiotr Jasiukajtis m = 0; 1636*25c28e83SPiotr Jasiukajtis if (j < 7) { 1637*25c28e83SPiotr Jasiukajtis ww = gam_n(j + 1, z); 1638*25c28e83SPiotr Jasiukajtis } else { 1639*25c28e83SPiotr Jasiukajtis w = y + one; 1640*25c28e83SPiotr Jasiukajtis if ((lx & 1) == 0) { /* y+1 exact (note that y<184) */ 1641*25c28e83SPiotr Jasiukajtis ww = large_gam(w, &m); 1642*25c28e83SPiotr Jasiukajtis } else { 1643*25c28e83SPiotr Jasiukajtis t = w - one; 1644*25c28e83SPiotr Jasiukajtis if (t == y) { /* y+one exact */ 1645*25c28e83SPiotr Jasiukajtis ww = large_gam(w, &m); 1646*25c28e83SPiotr Jasiukajtis } else { /* use y*gamma(y) */ 1647*25c28e83SPiotr Jasiukajtis if (j == 7) 1648*25c28e83SPiotr Jasiukajtis ww = gam_n(j, z); 1649*25c28e83SPiotr Jasiukajtis else 1650*25c28e83SPiotr Jasiukajtis ww = large_gam(y, &m); 1651*25c28e83SPiotr Jasiukajtis t4 = ww.h + ww.l; 1652*25c28e83SPiotr Jasiukajtis t1 = (double) ((float) y); 1653*25c28e83SPiotr Jasiukajtis t2 = (double) ((float) t4); 1654*25c28e83SPiotr Jasiukajtis /* t4 will not be too large */ 1655*25c28e83SPiotr Jasiukajtis ww.l = y * (ww.l - (t2 - ww.h)) + (y - t1) * t2; 1656*25c28e83SPiotr Jasiukajtis ww.h = t1 * t2; 1657*25c28e83SPiotr Jasiukajtis } 1658*25c28e83SPiotr Jasiukajtis } 1659*25c28e83SPiotr Jasiukajtis } 1660*25c28e83SPiotr Jasiukajtis 1661*25c28e83SPiotr Jasiukajtis /* compute 1/(ss*ww) */ 1662*25c28e83SPiotr Jasiukajtis t3 = ss.h + ss.l; 1663*25c28e83SPiotr Jasiukajtis t4 = ww.h + ww.l; 1664*25c28e83SPiotr Jasiukajtis t1 = (double) ((float) t3); 1665*25c28e83SPiotr Jasiukajtis t2 = (double) ((float) t4); 1666*25c28e83SPiotr Jasiukajtis z1 = ss.l - (t1 - ss.h); /* (t1,z1) = ss */ 1667*25c28e83SPiotr Jasiukajtis z2 = ww.l - (t2 - ww.h); /* (t2,z2) = ww */ 1668*25c28e83SPiotr Jasiukajtis t3 = t3 * t4; /* t3 = ss*ww */ 1669*25c28e83SPiotr Jasiukajtis z3 = one / t3; /* z3 = 1/(ss*ww) */ 1670*25c28e83SPiotr Jasiukajtis t5 = t1 * t2; 1671*25c28e83SPiotr Jasiukajtis z5 = z1 * t4 + t1 * z2; /* (t5,z5) = ss*ww */ 1672*25c28e83SPiotr Jasiukajtis t1 = (double) ((float) t3); /* (t1,z1) = ss*ww */ 1673*25c28e83SPiotr Jasiukajtis z1 = z5 - (t1 - t5); 1674*25c28e83SPiotr Jasiukajtis t2 = (double) ((float) z3); /* leading 1/(ss*ww) */ 1675*25c28e83SPiotr Jasiukajtis z2 = z3 * (t2 * z1 - (one - t2 * t1)); 1676*25c28e83SPiotr Jasiukajtis z = t2 - z2; 1677*25c28e83SPiotr Jasiukajtis 1678*25c28e83SPiotr Jasiukajtis /* check whether z*2**-m underflow */ 1679*25c28e83SPiotr Jasiukajtis if (m != 0) { 1680*25c28e83SPiotr Jasiukajtis hx = __HI(z); 1681*25c28e83SPiotr Jasiukajtis i = hx & 0x80000000; 1682*25c28e83SPiotr Jasiukajtis ix = hx ^ i; 1683*25c28e83SPiotr Jasiukajtis j = ix >> 20; 1684*25c28e83SPiotr Jasiukajtis if (j > m) { 1685*25c28e83SPiotr Jasiukajtis ix -= m << 20; 1686*25c28e83SPiotr Jasiukajtis __HI(z) = ix ^ i; 1687*25c28e83SPiotr Jasiukajtis } else if ((m - j) > 52) { 1688*25c28e83SPiotr Jasiukajtis /* underflow */ 1689*25c28e83SPiotr Jasiukajtis if (xk == 0) 1690*25c28e83SPiotr Jasiukajtis z = -tiny * tiny; 1691*25c28e83SPiotr Jasiukajtis else 1692*25c28e83SPiotr Jasiukajtis z = tiny * tiny; 1693*25c28e83SPiotr Jasiukajtis } else { 1694*25c28e83SPiotr Jasiukajtis /* subnormal */ 1695*25c28e83SPiotr Jasiukajtis m -= 60; 1696*25c28e83SPiotr Jasiukajtis t = one; 1697*25c28e83SPiotr Jasiukajtis __HI(t) -= 60 << 20; 1698*25c28e83SPiotr Jasiukajtis ix -= m << 20; 1699*25c28e83SPiotr Jasiukajtis __HI(z) = ix ^ i; 1700*25c28e83SPiotr Jasiukajtis z *= t; 1701*25c28e83SPiotr Jasiukajtis } 1702*25c28e83SPiotr Jasiukajtis } 1703*25c28e83SPiotr Jasiukajtis return (z); 1704*25c28e83SPiotr Jasiukajtis } 1705