This page contains PARLANSE programs to:
- Compute Prime Numbers using Speculation
- Calculate Pi Using Divide-and-Conquer Integration (includes some speedup curves for an 8-processor system)
This PARLANSE program runs the Sieve of Eratosthenese using 2 parallel grains (the (|| operator, below) performing speculative computation. For each new candidate primes, each grain tries half the candidate divisors. Speculative parallelism occurs in that one grain may continue trying divisors after the other has discovered that the candidate is composite. Exception handing ( (try clause and (raise operator) is used to abort the speculating grain to avoid wasting too much computation. The (;; operator causes sequential execution; PARLANSE forces programmers to say this when necessary to encourage them to "think parallel". No, PARLANSE is not a dialect of LISP; this is just convenient syntax.
[nprimes natural] [primes (array natural 1 65536)] [lasttrialdivisorindex natural] (define Composite (exception (lambda (function string void) `Not prime!'))) (define primeq (lambda (function boolean natural) (try (value (|| (local (= [i natural] 2) (while (<= i lasttrialdivisorindex) (ifthenelse (== (modulo ? primes:i) 0) (raise Composite) (+= i 2 ) )ifthenelse )while )local (local (= [i natural] 3) (while (<= i lasttrialdivisorindex) (ifthenelse (== (modulo ? primes:i) 0) (raise Composite) (+= i 2 ) )ifthenelse )while )local )|| ~t)value ; no exception --> prime number! (acknowledge ~f) ; handle exception, not a prime! )try )lambda )define (define main (action (procedure void) (;; (= nprimes 1) (= primes:1 2) (= lasttrialdivisorindex 1) (do [candidate natural] 000000003 4000000000 2 (;; (ifthen (< (* primes:lasttrialdivisorindex primes:lasttrialdivisorindex) candidate) (+= lasttrialdivisorindex) )ifthen (ifthen (primeq candidate) (;; (+= nprimes) (ifthen (<= nprimes 65536) (= primes:nprimes candidate) )ifthen (ifthen (== (modulo nprimes 1000) 0) (;; (PutConsoleNatural nprimes)(PCSP) (PutConsoleNatural primes:lasttrialdivisorindex)(PCSP) (PutConsoleNatural candidate)(PCSP) (PCNL) );; )ifthen );; )ifthen );; )do );; )action )define
This program computes Pi by integration. It obtains parallelism by recursively subdividing the integration region, and forking (the "(||" operator) parallel grains for each region. The PARLANSE runtime system manages the creation of grains and keeps space demand reasonable. A speedup table follows the program.
`Program to compute Pi by integration using Divide-and-conquer parallelism' (define dx `Step size for computing Pi by Integration' 1e-8) (define Efficient_Work_per_Grain `min work given to a processor' 100.0) [LoopOverhead float] (include `c:\utility\parlanse\include\parmodule.par') (define calculate_part (lambda (function float (structure [start_point float] [end_point float])) )lambda )define (define calculate_part (lambda (function float (structure [start_point float] [end_point float])) (let (;; [mid_point float] [area_sum float] [x float] [first_half_area float] [second_half_area float] );; (ifthenelse (<= (/ (- end_point start_point) dx) Efficient_Work_per_Grain) (value (;; (= area_sum 0.0) (= x start_point) (while (< x end_point) (;; (+= area_sum (* dx (/ 4.0 (+ 1.0 (* x x))))) (+= x dx) );; )while );; (- area_sum (* (- x end_point) (/ 4.0 (+ 1.0 (* (- x dx) (- x dx)))))* )- )value (value (;; (= mid_point (+ start_point (/ (- end_point start_point)- 2.0)/ )+ )= (|| (= first_half_area (calculate_part start_point mid_point)) (= second_half_area (calculate_part mid_point end_point)) )|| );; (+ first_half_area second_half_area) )value )ifthenelse )let )lambda )define (define main (action (procedure void) (;; (PAR:PutConsoleFloat (calculate_part 0.0 1.0)) );; )action )
Execution time on Data General AV8600 (1/24/98)
(8-way 200Mhz Pentium Pro 512Kb cache)
The computed value of Pi is 3.141592657874785 in all runs..
# Processors Execution Time Speedup (logical) (seconds) (over 1 processor) 1 77.55 1 2 38.74 2.0 3 25.98 2.98 4 19.64 3.95 5 15.87 4.89 6 13.33 5.82 7 11.55 6.71 8 10.20 7.60 9* 10.40 7.46 16* 11.42 6.79
Note that a nearly perfect linear speedup is obtained, with extremely little effort on the part of the programmer. A simple formula estimating speedup for this program is:
speedup = NCPUs * ( 136 / ( 135 + NCPUs) )
(*PARLANSE will run on any system, using any number of logical processors, even if that value exceeds the number of physical processors available. Asking PARLANSE to run with more logical processors than physical processors causes some extra overhead as Windows multiplexes physical processors onto the threads used by PARLANSE, and that lowers the efficiency somewhat.)