CHAPTER 2

Data Types in MACSYMA

This chapter describes the data types used by MACSYMA. The first table below summarizes the basic data types, which are familiar to most mathematicians who have used computers in their work. They are similar to the data types of conventional programming languages, although in some ways they are extended or generalized.

Table 2.1: Basic Data Type Summary

Data type              | Short Name | Comments                    | Section
-----------------------+------------+-----------------------------+----------------
number                 | num        | Any numeric type            | 2-1 through 2-5
integer                | int        | Arbitrary length            | 2-1
rational               | ratnum     | Including integers          | 2-2
floating-point number  | flonum     |                             | 2-3
big-float number       | bfloat     |                             | 2-4
complex number         | complex    |                             | 2-5
Boolean                | bool       | literally "true" or "false" | 2-6
symbol                 | symbol     | symbolic literal name       | 2-7
constant               | constant   | identifier                  | 2-8
string                 | string     | messages, etc.              | 2-9

Not all of the data types are as simple as those above. MACSYMA provides notations for three purposes: (a) a representation language for mathematical data; (b) a command language to manipulate that data; and (c) a programming language for constructing additional notations or commands. These are summarized in the table below.

Table 2.2: Complex Data Type Summary

Data type              | Short Name | Comments                            | Section
-----------------------+------------+-------------------------------------+--------
function name          | fun        | a symbol interpreted as a function  |
filename               | file       | Just a string, usually              |
"cre"-form             | cre        | Canonical Rational Expression       |
"general"-form         | gen        | Includes anything else              |
Poisson form           | pois       | See section ??                      |
program                | prog       |                                     |
list                   | list       | A type of general form data         |
matrix                 | mat        | list of lists                       |
any                    | any        | Any of the above                    |

(I) [an intermediate-level note follows] Because MACSYMA crosses the line between symbolic and programming notations, variables and indeterminates are easily mixed.
Although the casual user of MACSYMA may never need to think about this, most serious users must understand that if one assigns to x the value 3, say, one must still be able to refer to the symbol x independent of its current value (e.g. in order to assign it a different value). In general, when MACSYMA is presented with a name like x, it "evaluates" the name and uses its associated value, say 3. If the name x has no value associated with it, the name is sometimes said to be "quoted" automatically, and its associated value is the symbol x itself. If it is necessary to refer to the symbol x itself even though it has an associated value, the notation 'x is used (that is an apostrophe, or "single quote", to the left of the symbol). The confusion is more apparent when one variable's value is (a name of) another variable.

2.1. Numbers

There are several sub-categories of numbers in MACSYMA. The principal representations are integers, rational numbers, floating point numbers, and "bigfloats", plus minor variations.

Integers are an essentially accurate model of the mathematical construct "ring of integers". The only limit to the size of a particular integer (or to the total size of all the integers) is the size of the computer memory you are using. MACSYMA differs from most conventional languages in this: most others represent integers as "integers modulo 2^n" or so. The external (printed) representation of an integer is a string of decimal digits.

Rational numbers are written as the quotient of two integers: numerator/denominator. MACSYMA uniformly reduces these numbers to a unique form, regardless of the form in which you type them in. The representation removes common factors from the numerator and denominator and keeps the denominator positive. If the numerator is a multiple of the denominator, the rational number is simplified to an integer.
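The exactness of integer and rational arithmetic described above can be illustrated with a short sketch. The inputs are hypothetical examples rather than a recorded session; they are written in lower case without (c)/(d) labels, and the values in the comments follow from the reduction rules just stated.

```maxima
/* Integers are exact: there is no wraparound modulo 2^n. */
2^100;        /* 1267650600228229401496703205376 */

/* Rationals are reduced to lowest terms with a positive denominator. */
10/4;         /* 5/2 */
6/(-4);       /* -3/2 */

/* A numerator that is a multiple of the denominator gives an integer. */
12/4;         /* 3 */
```

Whatever form the quotient is typed in, the same unique reduced form results, which is why 10/4 and 5/2 are indistinguishable once read in.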
Floating point numbers are an approximation to the real numbers, and are supported in MACSYMA because they are supported directly in the computer hardware. They are represented internally in the double-precision* format of the host system. They are written essentially as in FORTRAN, i.e. as strings of digits containing a period, optionally followed by an integer exponent beginning with the letter e. Their default output format can be altered by the user to provide more or less precision via the function floatformat.

____________________
* It would be convenient to standardize on the IEEE arithmetic specification, but we do not require it.

Printed: May 31, 1986

The input format for bigfloats is similar to that of floating point numbers in "e" format, except that the letter b is used to denote the beginning of the exponent. The b is compulsory, to distinguish bigfloats from floating point numbers. A bigfloat is also an approximation to a real number, but it carries along with it a parameter which is the number of digits carried. This is determined by the value of the MACSYMA variable fpprec at the time the number was created.

Non-zero floating point numbers on the VAX-11 must have absolute value between about 1.e-38 and 1.7e38 and are limited to approximately 16 digits of precision. This is a hardware limitation of the computer. Bigfloats may have any number of digits. The default precision is 16, but the user can change this by setting the variable fpprec [DEFAULT: 16] to an integer representing the desired precision.

____________________________________________________
-17253733574534     6.023e23     -1.6e-19     37.567834987250832568b-98
3.14159     227     -3354665557331/66724255465544     -.7b0
____________________________________________________

2.2. Boolean Values

There are only two Boolean values, namely true and false. In many programming systems these Boolean values are treated differently from other values or names, but in MACSYMA they are treated much the same as other simple names. They do, however, have special semantic properties, since true and false evaluates to false, etc.

2.3. Names

Names are used to designate variables, functions, and arrays. A name consists of a string of letters (which may include % and _ ) and digits. It may also include other characters, but these must be preceded by a \ when typed in. Names can be of any length and must begin with a letter (unless the leading character is a \ ). Upper case letters may be typed, but they are normally converted into the corresponding lower case letters.

____________________________________________________
%pi     epsilon     x10y30isastrangename     ecial     standard_deviation
____________________________________________________

2.4. Strings

A string of characters of any length may be constructed by enclosing the string in quotation marks ("). To include a quotation mark, semicolon, or dollar sign in the string, type a \ to its left.

Quoted strings are useful as messages (such as those giving instructions for entering data) or as descriptive titles for printed data. Although you can sometimes get the same results by using symbols (since the symbol hello and the string "hello" print the same), it is safer to use strings, because some symbols have special properties and reserved usages. If you must include quotes in a string, they may themselves be quoted with \.

EXAMPLE:
"Input amount in \$"
"Riemann's \"Zeta\" Function"

2.5. Atomic Variables

Names can be used as synonyms for locations in computer memory which store information. Assigning a value to a name is equivalent to storing information in that location.
Because of the nature of MACSYMA, these "locations" can store almost any kind of data, regardless of its apparent size or complexity. Names which are assigned values are sometimes referred to as variables, as opposed to mathematical indeterminates. If a variable is "subscripted", the same name, say x, can be used for a sequence or array of values by the notation x[i] for x sub i, etc. If it is important to exclude subscripted variables, we will designate the non-subscripted variety as "atomic variables".

myoptions [DEFAULT: []] is a MACSYMA variable, an "infolist", which is a list (see the section later in this chapter on lists) of all the MACSYMA options you set during a session.

Subsequently, when we describe a command, the names we use will reflect the data types in this table; e.g. x_name will be of type name.

2.6. General Form

Mathematical expressions are constructed by using infix, prefix, postfix and functional-position operators. That is, MACSYMA attempts to include the usual types of mathematical notation in a linearized form. The usage and priorities of operators, from highest to lowest, are:

Operator Name                    | Symbol  | Usage
---------------------------------+---------+--------
factorials                       | ! !!    | postfix
exponentiation                   | ** or ^ | infix
non-commutative exponentiation   | ^^      | infix
non-commutative multiplication   | .       | infix
multiplication, division         | * /     | infix
negation                         | -       | prefix
addition, subtraction            | + -     | infix

The way this works is that a*b+c is parsed as (a*b)+c because * has higher priority than +. Also, "^" is right-associative, meaning that a^b^c is a^(b^c) rather than (a^b)^c, which is equal to a^(b*c). Other operators of equal precedence associate to the left.

EXAMPLE: sin(a*b*x^y/z!)^2 means (sin(((a*b)*(x^y))/(z!)))^2

If you refer to an operator out of the context of an expression, you must enclose it in quotation marks.
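The precedence and associativity rules above can be checked on small numeric cases, where the two possible groupings give different values. This is a minimal sketch with hypothetical inputs; the commented results are the ones the stated rules imply.

```maxima
/* "*" binds more tightly than "+": both inputs denote the same expression. */
is(a*b + c = (a*b) + c);    /* true */

/* "^" associates to the right. */
2^3^2;        /* 2^(3^2) = 2^9 = 512 */
(2^3)^2;      /* 8^2 = 64 */

/* Operators of equal precedence associate to the left. */
8/4/2;        /* (8/4)/2 = 1 */
8/(4/2);      /* 4 */

/* Postfix "!" binds more tightly than "^". */
2^3!;         /* 2^(3!) = 2^6 = 64 */
```

When in doubt about how an input will be grouped, explicit parentheses cost nothing and remove the ambiguity.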
In the case of particularly unwieldy exponents, expt is used to display exponentiation.

! is the factorial, which is the product of all the integers from 1 up to its argument. Thus 5! = 1*2*3*4*5 = 120. The value of the option factlim [DEFAULT: -1] gives the highest factorial which is automatically expanded. If it is -1, then factorials of all integers are expanded.

!! stands for double factorial, which is defined as the product of all the consecutive odd (or even) integers from 1 (or 2) to the odd (or even) argument. Thus 8!! is 2*4*6*8 = 384.

Period is used for a non-commutative product operation. You can elaborate on the semantics of this operation by various simplifier facilities described later. When typing it, you must precede and follow it with a space wherever it would otherwise look like part of a floating point number.

Non-commutative exponentiation is provided: m^^2 means m . m.

Parentheses can be used to change the order of evaluation. Also, functional application has the highest priority.

The operands may be any MACSYMA expressions whose values are the correct types of data. Recall that every statement in MACSYMA yields a value, even if the value is only a trivial one.

MACSYMA has no restriction on the mixing of modes of operands. Integers, rationals, floating point numbers, and bigfloats may be freely intermixed in an expression; when conversions are necessary, the priority of conversion is toward the end of the list just mentioned. If floating point numbers or bigfloats of differing precision are combined in an operation, they will be converted to floating point or bigfloat numbers of the current precision by padding with zeroes or by dropping low order digits and rounding. Floating point underflow will return 0.0.

2.7. Functions and Arrays

Sometimes it is useful to associate names with sequences or arrays of values, or to associate names with some rule for the computation of values.
For the first of these, arrays are generally used, and for the second, it is possible to write programs, or "functions". Again, the mathematically formal notion of function is inappropriate; the relevant concepts are outlined in appendix ? for the programming novice.

2.7.1. Functions

The novel idea concerning functions in MACSYMA, even for most programming language buffs, is that these objects can themselves be manipulated as data in MACSYMA itself. This will perhaps become clear in this section.

A function is defined by using the command ":=", which associates a name with a function "body" or "lambda-expression". The lambda-expression provides a fixed format for determining the association of actual parameters, or arguments, to the function with the names used to specify the computation in the body. Since this paragraph is unlikely to make sense to you unless you already knew what it meant, we will try again via the following example:

____________________________________________________
(c1) ":="(f,lambda[x,y],2*x+y+z);
(d1)                f(x, y) := 2 x + y + z
(c2) f(3,4);
(d2)                z + 10
____________________________________________________

Note the alternative syntax displayed on line d1, in which the function definition is written as a name followed by the formal parameters to the function (here, x and y), separated by commas and enclosed in parentheses. The body of the function is an arbitrary MACSYMA expression involving those parameters and perhaps other objects. The use of the function on line c2 illustrates the actual parameters (here, 3 and 4), which are associated by position with x and y respectively, then substituted into the function "body" and evaluated. The arguments may be any expressions whatsoever, not just numbers.
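As a minimal sketch of the same definition typed directly in the name-and-parameters form, the lines below define f and then call it twice. The second call uses hypothetical symbolic arguments to show that actual parameters need not be numbers; z is deliberately left without a value.

```maxima
/* Define f with formal parameters x and y; z is not a parameter. */
f(x, y) := 2*x + y + z;

/* Actual parameters are matched by position: x gets 3, y gets 4. */
f(3, 4);              /* z + 10 */

/* Arguments may be arbitrary expressions. */
f(a + 1, b^2);        /* equivalent to 2*(a + 1) + b^2 + z */
```

Because z is not among the formal parameters, it is simply carried through as an indeterminate in both results.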
Now that you understand lambda forms, you could, if you wish, read the section on them in this chapter.

There are many more elaborate techniques for function definition, which are deferred to the next chapter. What we wish to identify here is the notion that a name can be a name of a function.

2.7.2. Arrays

Arrays enable one to refer to a collection of elements by using a single name. An element of an array is referred to by a subscripted variable, which is a name followed by a list of subscripts enclosed in square brackets.

Arrays in MACSYMA are of two types,* declared or undeclared. Declared arrays are similar to FORTRAN arrays. The user declares the number of dimensions and indicates the maximum value of each subscript. The system then allocates space for the entire array. To declare an array, the user calls the ARRAY function with the array name and the dimensions dim_1 through dim_k. This sets up a k-dimensional array. The subscripts for the ith dimension are the integers running from 0 to dim_i.

If the user assigns to a subscripted variable without declaring the corresponding array, an undeclared array is set up. Undeclared arrays, otherwise known as hashed arrays (because hash coding is done on the subscripts), are more general than declared arrays. The user does not declare their maximum size, and they grow dynamically by hashing as more elements are assigned values. The subscripts of undeclared arrays need not even be numbers. However, unless an array is rather sparse, it is probably more efficient to declare it when possible than to leave it undeclared. The ARRAY function can be used to transform an undeclared array into a declared array.

Array elements can be assigned values explicitly with the : operator or implicitly by means of an associated function, and the values assigned may be any MACSYMA expression. To understand implicit assignment, we must understand what MACSYMA does when asked to evaluate a subscripted variable.
MACSYMA first evaluates the subscripts left to right. Then it does an array access to see if the requested array element already has a value. If it does, the value is returned. If it does not, MACSYMA checks to see whether the array has an associated function (see below). If not, the subscripted variable (with the subscripts evaluated) is returned. (This is standard MACSYMA practice: if there is no value for a variable, the variable itself is returned when an evaluation is done.) If there is an associated function, the parameters of the function are bound to the given subscripts, and the function body is evaluated. The value of the function call is stored in the appropriate array element and returned.

____________________
* For efficient translation, the user can also inform MACSYMA of arrays all of whose elements are of a single type, e.g. FIXNUM or FLOAT, or whose elements are all known before being referenced, called a COMPLETE array.

Note that once an element is computed by the associated function, it is stored so that the next time it is needed it will not be recomputed. A consequence of this is that unless the user uses the $fun, $fun, or $fun functions (sec. $ref) to kill an array element or the entire array, the associated function will never be called a second time on the same arguments. Thus the user should be aware that even if he redefines the associated function, those values which already exist will stay around. Of course, individual array elements can be changed by assignment at any time.

These associated functions are defined with the := operator. Their definition looks exactly the same as an ordinary function definition, except that the parameters on the left side of the definition are enclosed in brackets instead of parentheses.
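The store-once behavior described above can be sketched with a hypothetical array named fib (not part of the original text), whose associated function computes Fibonacci numbers. The base cases are assigned explicitly with the : operator; every other element is assumed to be computed by the associated function on first reference and stored, as this section describes.

```maxima
/* Associated function: parameters in brackets, so results are stored. */
fib[n] := fib[n-1] + fib[n-2];

/* Explicit assignments supply the base cases. */
fib[0] : 0;
fib[1] : 1;

/* Each fib[k] for k <= 20 is computed once and then stored. */
fib[20];      /* 6765 */

/* This reference finds the stored element; the function is not called. */
fib[15];      /* 610 */
```

Written with parentheses instead of brackets, the same recurrence would be an ordinary function that repeats earlier work on every recursive call, so the bracketed form is the natural choice when many elements are needed.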
In order to use a subscripted variable as a single entity, without it being an array and without ever assigning a value to it, it should be prefixed by an apostrophe to avoid it being confused with a non-subscripted variable of the same name. For example: SUBST(0,W,W+'W[0]).

The MACSYMA variable $var is a list of all the arrays that have been allocated, both declared and undeclared. $fun $see may be used to display the definition of an array associated function. $fun $see may be used to find out whether an array is declared or undeclared, how large it is, how many subscripts it has, and which elements have values in the case of an undeclared array.

____________________________________________________
(C1) A[N]:=N*A[N-1]$
(C2) A[0]:1$
(C3) A[5];
(D3)                120
(C4) A[N]:=N$
(C5) A[6];
(D5)                6
(Note that the definition in C4 is being used because A[6] has no value up to this time.)
(C6) A[4];
(D6)                24
(Since A[4] was assigned a value as a result of A[5] being computed, the new definition is not used.)
____________________________________________________

If one is going to define a recursive function which is to be called several times, then it may be more efficient to use an array with an associated function for initialization. The reason is that once an element is computed it is stored and thus need not be computed again, whereas with a non-subscripted function, each recursive call may cause a repeat of a past computation.

2.8. Lambda Notation *

Note that this section has an asterisk: avoid on first reading.

Lambda* notation provides a convention for establishing correspondence between the formal and actual parameters of a function, and the function "body" to be evaluated.

____________________
* Blame it on the logician A. Church, and on John McCarthy, who picked this up for programming languages.
Lambda notation is useful in MACSYMA primarily in the construction and use of "anonymous", un-named functions for use in limited circumstances. It can also be used by the purist to define functions by creating lambda-expressions and giving them names!

____________________________________________________
(c1) f:lambda([x,y,z],r(x^2)+s(y^2)+v(z^2));
                                    2       2       2
(d1)        lambda([x, y, z], v(z ) + s(y ) + r(x ))
(c2) f(1,2,a);
(d2) FIX THIS UP
____________________________________________________

MACSYMA operators can also be used in functional position; in order not to get a syntax error, they must be surrounded by "s.

____________________________________________________
(C3) "+"(1,2,A);
(D3)                A + 3
____________________________________________________

[MOVE THIS TO NEXT CHAPTER]

Subscripted Functions (Arrays of Functions)

It is possible for the value of an array element to be a lambda expression. Thus if the assignment F[1]:LAMBDA([X],X^2+1) were performed, then F[1] could be used in the ordinary prefix functional sense with its arguments following in parentheses, e.g. F[1](3) would yield the value 10. There is an alternative syntax available for assigning a lambda expression to an array, which introduces the notion of a "subscripted function". In the above case one could also type F[1](X):=X^2+1, and this would be entirely equivalent. Other elements of the array could be assigned different lambda expressions (or any MACSYMA expressions).

If there is an algorithm for computing the different functions to be stored in an array on the basis of the subscripts alone, then one may use an associated function. For example, F[K]:=LAMBDA([X],X^K+1). Again, an alternative syntax of F[K](X):=X^K+1 may be used. The left side of the definition consists of the function name followed by the subscripts, enclosed in brackets, followed by the arguments, enclosed in parentheses.
The subscripts (which are not evaluated at definition time) must be either all numeric or all symbolic.

Note that subscripted functions are treated exactly like arrays, so all of the information in sec. $ref applies. In particular, when a subscripted function is referenced, the element is immediately retrieved and applied to its arguments if it exists; otherwise it is computed (this time only) and then applied. Consequently, two evaluations of the definition are performed: one when the element is computed and one when it is applied.

Thus consider the definition F[K](E):=COEFF(E,X,K) and the call F[2](3*X^2-1). Although the user may have thought that this would return the coefficient of X^2 in 3*X^2-1, i.e. 3, it will return 0. The reason is that F[2] is first computed by evaluating the definition, yielding 0, since E has not been bound at this time. Note that F[K](E):=SUBST(K,'J,'(COEFF(E,X,J))) would return the desired result, as would F(K,E):=COEFF(E,X,K). Thus the user should be clear about the distinction between subscripted functions (a type of array) and ordinary functions.

Also, a subscripted function should not be redefined without $fun'ing or $fun'ing it first; otherwise the elements which have already been stored will be used. The $var list $see also includes subscripted functions. The function $fun $see may also be used on subscripted functions.

____________________________________________________
(C1) T[N](X):=RATSIMP(2*X*T[N-1](X)-T[N-2](X))$
(This generates the Chebyshev polynomials.)
(C2) T[0](X):=1$
(C3) T[1](X):=X$
(C4) T[4](Y);
                       4      2
(D4)                8 Y  - 8 Y  + 1
(C5) G[N](X):=SUM(EV(X),I,N,N+2)$
(C6) H(N,X):=SUM(EV(X),I,N,N+2)$
(C7) G[2](I^2);
                       2
(D7)                3 I
(C8) H(2,I^2);
(D8)                29
____________________________________________________

The following illustrates a definition for the Legendre polynomials.
____________________________________________________
(C9) P[N](X):=RATSIMP(1/(2^N*N!)*DIFF((X^2-1)^N,X,N))$
(C10) Q(N,X):=RATSIMP(1/(2^N*N!)*DIFF((X^2-1)^N,X,N))$
(C11) P[2];
                            2
                         3 X  - 1
(D11)        LAMBDA([X], --------)
                            2
(C12) P[2](Y+1);
                        2
               3 (Y + 1)  - 1
(D12)          --------------
                     2
(C13) Q(2,Y+1);
                  2
               3 Y  + 6 Y + 2
(D13)          --------------
                     2
(C14) P[2](5);
(D14)               37
(C15) Q(2,5);
5 attempt to differentiate wrt a number
____________________________________________________

In order to pass a function as an argument to another function, you need only give its name in the argument list of the call. It may then be used in the called function by following the name of the corresponding formal parameter with a parenthesized list of arguments. Subscripted functions $see are passed by giving the name followed by the subscripts in brackets. Arrays can be passed by giving the name of the array in the argument list, and they can be referenced by subscripting the corresponding formal parameter.

When passing names of functions or arrays, one must take care that there is no atomic variable with the same name which is bound, because then that value rather than the name will be passed. In this case the name should be preceded by a ' $see to prevent it from being evaluated.

In order to assign to a formal parameter of a function so that the corresponding actual parameter gets changed (and remains changed) when the function is exited, the :: operator rather than the : operator should be used.
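The parameter-passing rules above can be illustrated with a small sketch. All of the names used (apply_twice, sq, set_to, target, val, counter) are hypothetical and chosen for illustration only. The sketch assumes that no function named g is defined elsewhere and that sq and counter have no values of their own when it is run.

```maxima
/* g is a formal parameter holding a function name or a lambda expression. */
apply_twice(g, x) := g(g(x));
sq(t) := t^2;
apply_twice(sq, 3);                        /* sq(sq(3)) = 81 */
apply_twice(lambda([t], t + 1), 5);        /* 7 */

/* "::" evaluates its left side, so the assignment reaches the symbol
   that the caller passed, not the formal parameter itself. */
set_to(target, val) := target :: val;
set_to('counter, 5);
counter;                                   /* 5 */
```

The apostrophe in the call to set_to follows the advice above: it ensures that the name counter, rather than any value it might already have, is what gets passed.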