CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 2 of 2
  1. #1
    Join Date
    Mar 2024
    Posts
    1

    OpenMP compute pi

    Hello,
    I am working my way into OpenMP.

    The following code is supposed to compute pi.
    It works fine for one thread, but in case of multiple threads the relative error gets larger. Can anyone spot the mistake? It should be trivial...

    Code:
    #include <iostream>
    #include <omp.h>
    #include <chrono>
    
    
    int main(){
    
    
        long int N = 1000000000;
        long double dx = 1./N;
    
        long double result = 0;
    
        
        auto t1 = std:: chrono::high_resolution_clock::now();
        
        omp_set_num_threads(2);  // works only for 1
        #pragma omp parallel
        {
            long int rank = omp_get_thread_num();
            long int nr_ranks = omp_get_num_threads();
            
            long int n_per_rank = N/nr_ranks;
            long int N_start = rank*n_per_rank;
            long int N_end = N_start + n_per_rank;
            
            // In case N not evenly divisible by ranks, last rank does remaining iterations.
            if (N % nr_ranks != 0  &&  rank == nr_ranks-1){
                N_end += N % nr_ranks;
            } 
    
            double x_start = N_start*dx;
    
            // Main loop.
            for(long int i=N_start; i<N_end; ++i){
                result += 4/(1+x_start*x_start)*dx;
                x_start += dx;
            }
    
        }
    
        auto t2 = std:: chrono:: high_resolution_clock:: now();
        auto ms_int = std:: chrono:: duration_cast < std:: chrono:: seconds > (t2 - t1);
        std:: cout << "Execution took " << ms_int.count() << " seconds!\n";
        
        std::cout<<"error is "<<(result-3.14159265358979323846)/3.14159265358979323846<<"\n";
    
    
    return 0;
    }

  2. #2
    Join Date
    Nov 2018
    Posts
    122

    Re: OpenMP compute pi

    Yeah, you're not calculating PI at all.

    You need to flag your result as being shared between multiple threads, otherwise you're losing updates to it.
    Code:
    #include <iostream>
    #include <omp.h>
    #include <chrono>
    
    
    int main(){
    
    
        const long int N = 1000000000;
        const long double dx = 1./N;
    
        long double result = 0;
    
        
        auto t1 = std:: chrono::high_resolution_clock::now();
        
        omp_set_num_threads(2);  // works only for 1
        #pragma omp parallel
        {
            long double result_by_rank = 0;
            long int rank = omp_get_thread_num();
            long int nr_ranks = omp_get_num_threads();
            
            long int n_per_rank = N/nr_ranks;
            long int N_start = rank*n_per_rank;
            long int N_end = N_start + n_per_rank;
            
            // In case N not evenly divisible by ranks, last rank does remaining iterations.
            if (N % nr_ranks != 0  &&  rank == nr_ranks-1){
                N_end += N % nr_ranks;
            } 
    
            double x_start = N_start*dx;
    
            // Main loop.
            for(long int i=N_start; i<N_end; ++i){
                result += 4/(1+x_start*x_start)*dx;
                result_by_rank += 4/(1+x_start*x_start)*dx;
                x_start += dx;
            }
            std::cout << "Rank:" << rank
                << " start:" << N_start
                << " end:" << N_end
                << " result:" << result
                << " result_by_rank:" << result_by_rank
                << "\n";
        }
    
        auto t2 = std:: chrono:: high_resolution_clock:: now();
        auto ms_int = std:: chrono:: duration_cast < std:: chrono:: seconds > (t2 - t1);
        std:: cout << "Execution took " << ms_int.count() << " seconds!\n";
        std::cout << "Final result=" << result << "\n";
        std::cout<<"error is "<<(result-3.14159265358979323846)/3.14159265358979323846<<"\n";
    
    
    return 0;
    }
    
    $ g++ -fopenmp foo.cpp
    $ ./a.out 
    Rank:1 start:500000000 end:1000000000 result:1.03382 result_by_rank:1.287
    Rank:0 start:0 end:500000000 result:1.86255 result_by_rank:1.85459
    Execution took 11 seconds!
    Final result=1.86255
    error is -0.407131
    The sum of all the result_by_rank is a lot better because each one is private to a thread.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured