High performance computing systems
Lab 3
Dept. of Computer Architecture
Faculty of ETI
Gdansk University of Technology
Paweł Czarnul
The following example presents a revised version of the simple master-slave application for the parallel integration of a given function that was presented in lab 2.
As before, the code assumes that the range to be integrated is divided into a predefined number of
subranges. This number should be considerably larger than the number of processes in the application.
The subranges are distributed among the slave processes so that the load is balanced. This time, however,
the master first distributes initial subranges to the slaves and then sends further subranges to them even
before the results for the previously sent parts have been gathered. This makes new subranges available as
soon as possible, so that the slave processes do not have to wait long for new work. Similarly, each slave
starts receiving a new subrange even before it begins computing the previously received one. Having
computed a subrange, the slave then starts sending the result for that subrange to the master while the
receive for the next subrange is already in progress.
The code uses MPI non-blocking functions to achieve this. Please note that blocking and non-blocking
functions can be mixed, e.g. a message sent with MPI_Send may be received with an MPI_Irecv.
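As a quick illustration of this point, the following minimal sketch (not part of the lab code; the payload value, tag 0 and the use of ranks 0 and 1 are only assumptions made for the example) shows a blocking MPI_Send matched by a non-blocking MPI_Irecv that is later completed with MPI_Wait. Run it with at least 2 processes.
// minimal standalone sketch: a blocking MPI_Send matched by a non-blocking MPI_Irecv
#include <stdio.h>
#include <mpi.h>
int main(int argc,char **argv) {
    int myrank;
    double value=0;
    MPI_Request req;
    MPI_Init(&argc,&argv);
    MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
    if (myrank==0) {
        value=3.14; // illustrative payload
        MPI_Send(&value,1,MPI_DOUBLE,1,0,MPI_COMM_WORLD);
    } else if (myrank==1) {
        MPI_Irecv(&value,1,MPI_DOUBLE,0,0,MPI_COMM_WORLD,&req);
        // other work can be done here while the message is in transit
        MPI_Wait(&req,MPI_STATUS_IGNORE);
        // only after MPI_Wait returns is value guaranteed to hold the received data
        printf("received %f\n",value);
    }
    MPI_Finalize();
    return 0;
}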
Note that in this particular example the processes exchange only small messages, but the code can be used
as a template for other applications working in the master-slave fashion.
#include <stdio.h>
#include <mpi.h>
#include <math.h>
#include <stdlib.h>

#define PRECISION 0.000001
#define RANGESIZE 1
#define DATA 0
#define RESULT 1
#define FINISH 2
#define DEBUG

// the function to integrate
double f(double x) {
    return sin(x)*sin(x)/x;
}

// straightforward rectangle-rule integration of f over [a,b)
double SimpleIntegration(double a,double b) {
    double i;
    double sum=0;
    for (i=a;i<b;i+=PRECISION)
        sum+=f(i)*PRECISION;
    return sum;
}
int main(int argc, char **argv) {
    MPI_Request *requests;
    int requestcount=0;
    int requestcompleted;
    int myrank,proccount;
    double a=1,b=100;
    double *ranges;
    double range[2];
    double result=0;
    double *resulttemp;
    int sentcount=0;
    int recvcount=0;
    int i;
    MPI_Status status;

    // Initialize MPI
    MPI_Init(&argc, &argv);
    // find out my rank
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    // find out the number of processes in MPI_COMM_WORLD
    MPI_Comm_size(MPI_COMM_WORLD, &proccount);

    if (proccount<2) {
        printf("Run with at least 2 processes");
        MPI_Finalize();
        return -1;
    }
    if (((b-a)/RANGESIZE)<2*(proccount-1)) {
        printf("More subranges needed");
        MPI_Finalize();
        return -1;
    }
    // now the master will distribute the data and slave processes will perform computations
    if (myrank==0) {
        requests=(MPI_Request *)malloc(3*(proccount-1)*sizeof(MPI_Request));
        if (!requests) {
            printf("\nNot enough memory");
            MPI_Finalize();
            return -1;
        }
        ranges=(double *)malloc(4*(proccount-1)*sizeof(double));
        if (!ranges) {
            printf("\nNot enough memory");
            MPI_Finalize();
            return -1;
        }
        resulttemp=(double *)malloc((proccount-1)*sizeof(double));
        if (!resulttemp) {
            printf("\nNot enough memory");
            MPI_Finalize();
            return -1;
        }
        range[0]=a;
        // first distribute some ranges to all slaves
        for(i=1;i<proccount;i++) {
            range[1]=range[0]+RANGESIZE;
#ifdef DEBUG
            printf("\nMaster sending range %f,%f to process %d",range[0],range[1],i);
            fflush(stdout);
#endif
            // send it to process i
            MPI_Send(range,2,MPI_DOUBLE,i,DATA,MPI_COMM_WORLD);
            sentcount++;
            range[0]=range[1];
        }
        // the first proccount-1 requests will be for receiving, the latter ones for sending
        for(i=0;i<2*(proccount-1);i++)
            requests[i]=MPI_REQUEST_NULL; // none active at this point
        // start receiving results from the slaves
        for(i=1;i<proccount;i++)
            MPI_Irecv(&(resulttemp[i-1]),1,MPI_DOUBLE,i,RESULT,MPI_COMM_WORLD,&(requests[i-1]));
        // start sending new data parts to the slaves
        for(i=1;i<proccount;i++) {
            range[1]=range[0]+RANGESIZE;
#ifdef DEBUG
            printf("\nMaster sending range %f,%f to process %d",range[0],range[1],i);
            fflush(stdout);
#endif
            ranges[2*i-2]=range[0];
            ranges[2*i-1]=range[1];
            // send it to process i
            MPI_Isend(&(ranges[2*i-2]),2,MPI_DOUBLE,i,DATA,MPI_COMM_WORLD,&(requests[proccount-2+i]));
            sentcount++;
            range[0]=range[1];
        }
        while (range[1]<b) {
#ifdef DEBUG
            printf("\nMaster waiting for completion of requests");
            fflush(stdout);
#endif
            // wait for completion of any of the requests
            MPI_Waitany(2*proccount-2,requests,&requestcompleted,MPI_STATUS_IGNORE);
            // if it is a result then send new data to the process
            // and add the result
            if (requestcompleted<(proccount-1)) {
                result+=resulttemp[requestcompleted];
                recvcount++;
#ifdef DEBUG
                printf("\nMaster received %d result %f from process %d",recvcount,resulttemp[requestcompleted],requestcompleted+1);
                fflush(stdout);
#endif
                // first check if the send has terminated
                MPI_Wait(&(requests[proccount-1+requestcompleted]),MPI_STATUS_IGNORE);
                // now send some new data portion to this process
                range[1]=range[0]+RANGESIZE;
                if (range[1]>b) range[1]=b;
#ifdef DEBUG
                printf("\nMaster sending range %f,%f to process %d",range[0],range[1],requestcompleted+1);
                fflush(stdout);
#endif
                ranges[2*requestcompleted]=range[0];
                ranges[2*requestcompleted+1]=range[1];
                MPI_Isend(&(ranges[2*requestcompleted]),2,MPI_DOUBLE,requestcompleted+1,DATA,MPI_COMM_WORLD,&(requests[proccount-1+requestcompleted]));
                sentcount++;
                range[0]=range[1];
                // now issue a corresponding recv
                MPI_Irecv(&(resulttemp[requestcompleted]),1,MPI_DOUBLE,requestcompleted+1,RESULT,MPI_COMM_WORLD,&(requests[requestcompleted]));
            }
        }
        // now send the FINISHING ranges to the slaves
        // shut down the slaves
        range[0]=range[1];
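The listing breaks off at this point in the available copy. What follows is a hedged sketch of how the program typically continues, reconstructed from the comments and buffer sizes above rather than copied from the original: the master posts the finishing (empty) ranges, waits for all outstanding requests, gathers the remaining results and prints the sum, while each slave overlaps computing the current subrange with receiving the next one and sending back the previous result. In this reconstruction termination is signalled by an empty range, so the FINISH tag defined at the top is not used; debug printouts, message texts and index arithmetic may differ from the original code.
        // SKETCH (reconstruction, not the original continuation)
        // send the empty finishing range to every slave, using separate buffers for the non-blocking sends
        for(i=1;i<proccount;i++) {
            ranges[2*(proccount-1)+2*i-2]=range[0];
            ranges[2*(proccount-1)+2*i-1]=range[1];
            MPI_Isend(&(ranges[2*(proccount-1)+2*i-2]),2,MPI_DOUBLE,i,DATA,MPI_COMM_WORLD,&(requests[2*proccount-3+i]));
        }
        // wait for all pending receives and sends (including the finishing sends)
        MPI_Waitall(3*(proccount-1),requests,MPI_STATUSES_IGNORE);
        // add the results delivered by the pending receives
        for(i=0;i<(proccount-1);i++)
            result+=resulttemp[i];
        // each slave still owes one more result, sent after it has seen the empty range
        for(i=0;i<(proccount-1);i++) {
            MPI_Recv(&(resulttemp[i]),1,MPI_DOUBLE,i+1,RESULT,MPI_COMM_WORLD,&status);
            result+=resulttemp[i];
        }
        printf("\nHi, I am process 0, the result is %f\n",result);
    } else { // slave processes
        requests=(MPI_Request *)malloc(2*sizeof(MPI_Request));
        ranges=(double *)malloc(2*sizeof(double));
        resulttemp=(double *)malloc(2*sizeof(double));
        if ((!requests) || (!ranges) || (!resulttemp)) {
            printf("\nNot enough memory");
            MPI_Finalize();
            return -1;
        }
        requests[0]=requests[1]=MPI_REQUEST_NULL;
        // receive the first subrange with a blocking receive
        MPI_Recv(range,2,MPI_DOUBLE,0,DATA,MPI_COMM_WORLD,&status);
        // an empty range means there is nothing left to do
        while (range[0]<range[1]) {
            // start receiving the next subrange before computing the current one
            MPI_Irecv(ranges,2,MPI_DOUBLE,0,DATA,MPI_COMM_WORLD,&(requests[0]));
            // compute the current subrange
            resulttemp[1]=SimpleIntegration(range[0],range[1]);
            // wait both for the new subrange and for the previous result to be sent
            MPI_Waitall(2,requests,MPI_STATUSES_IGNORE);
            range[0]=ranges[0];
            range[1]=ranges[1];
            resulttemp[0]=resulttemp[1];
            // start sending the result back to the master
            MPI_Isend(&(resulttemp[0]),1,MPI_DOUBLE,0,RESULT,MPI_COMM_WORLD,&(requests[1]));
        }
        // make sure the last result has reached the master
        MPI_Wait(&(requests[1]),MPI_STATUS_IGNORE);
    }
    // shut down MPI
    MPI_Finalize();
    return 0;
}
Assuming a hypothetical file name integrate.c, the complete program can be built and run in a typical MPI installation with, e.g., mpicc integrate.c -o integrate -lm followed by mpirun -np 8 ./integrate.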