Results 1 to 15 of 17
  1. #1
    Join Date
    Oct 2017
    Posts
    50

    A question about multi threading.

    I just wrote a simple ray tracer with multi-threading. I divide the rendering process across 8 threads. The result is 3.5 times faster than using a single thread. My question is: is this a correct use of multi-threading?

    Code:
    #define STB_IMAGE_WRITE_IMPLEMENTATION
    
    #include "Sphere.h"
    #include "stb_image_write.h"
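    #include <chrono>
    #include <vector>
    #include <thread>
    #include <iostream>
    #include <fstream>
    #include <cstdint>    // uint32_t
    #include <cstdlib>    // system
    #include <cstring>    // memcpy
    #include <functional> // std::ref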
    
    void Render(std::vector<Shape*> pSpheres, uint32_t imageWidth ,uint32_t imageHeightStartIndex, uint32_t imageHeightEndIndex, std::vector<unsigned char>& imageData)
    {
    	HitRecord hitRecord;
    	glm::vec3 rayDirection = glm::vec3(0, 0, -1);
    
    	uint32_t imageDataPartIndex = 0;
    
    	for (uint32_t i = imageHeightStartIndex; i < imageHeightEndIndex; ++i)
    	{
    		for (uint32_t j = 0; j < imageWidth; ++j)
    		{
    			Ray ray(glm::vec3(j, i, 0), rayDirection);
    			bool isHit = false;
    
    			for (uint32_t k = 0; k < pSpheres.size(); ++k)
    			{
    				if (pSpheres[k]->Hit(ray, 0.001f, 1000.0f, &hitRecord))
    					isHit = true;
    			}
    
    			if (isHit)
    			{
    				imageData[imageDataPartIndex] = (unsigned char)(hitRecord.color.x * 255.0f);
    				imageData[imageDataPartIndex + 1] = (unsigned char)(hitRecord.color.y * 255.0f);
    				imageData[imageDataPartIndex+ 2] = (unsigned char)(hitRecord.color.z * 255.0f);
    
    				imageDataPartIndex += 3;
    			}
    			else
    			{
    				imageData[imageDataPartIndex] = 0;
    				imageData[imageDataPartIndex + 1] = 0;
    				imageData[imageDataPartIndex + 2] = 0;
    
    				imageDataPartIndex += 3;
    			}
    		}
    	}
    
    }
    
    int main()
    {
    	std::chrono::system_clock::time_point startTimer = std::chrono::system_clock::now();
    
    	uint32_t imageWidth = 1920;
    	uint32_t imageHeight = 1080;
    
    	std::vector<Shape*> pSpheres;
    	pSpheres.push_back(new Sphere(glm::vec3((float)imageWidth / 2, (float)imageHeight / 2, -2.0f), glm::vec3(1.0f, 0.0f, 0.0f), 400));
    	pSpheres.push_back(new Sphere(glm::vec3(0.0f, 0.0f, -2.0f), glm::vec3(0.0f, 1.0f, 0.0f), 400));
    	pSpheres.push_back(new Sphere(glm::vec3((float)imageWidth, 0.0f, -2.0f), glm::vec3(1.0f, 0.0f, 1.0f), 400));
    	pSpheres.push_back(new Sphere(glm::vec3((float)imageWidth, (float)imageHeight, -2.0f), glm::vec3(1.0f, 1.0f, 1.0f), 400));
    	pSpheres.push_back(new Sphere(glm::vec3(0.0f, (float)imageHeight, -2.0f), glm::vec3(1.0f, 0.0f, 1.0f), 400));
    
    	std::vector<unsigned char> imageData(imageWidth * imageHeight * 3);
    	std::vector<unsigned char> imageDataParts[8];
    	imageDataParts[0].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[1].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[2].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[3].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[4].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[5].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[6].resize(imageWidth * imageHeight * 3 / 8);
    	imageDataParts[7].resize(imageWidth * imageHeight * 3 / 8);
    	std::vector<std::thread> threads(8);
    
    	for (uint32_t i = 0; i < 8; ++i)
    		threads[i] = std::thread(Render, pSpheres, imageWidth, imageHeight / 8 * i, imageHeight / 8 * (i + 1), std::ref(imageDataParts[i]));
    
    	for (uint32_t i = 0; i < 8; ++i)
    		threads[i].join();
    
    	for (uint32_t i = 0; i < 8; ++i)
    		threads[i] = std::thread(memcpy, imageData.data() + (imageDataParts[0].size() * i), imageDataParts[i].data(), imageDataParts[i].size());
    
    	for (uint32_t i = 0; i < 8; ++i)
    		threads[i].join();
    
    	std::chrono::system_clock::time_point endtimer = std::chrono::system_clock::now();
    
    	stbi_write_png("Simple Ray Tracing.png", imageWidth, imageHeight, 3, imageData.data(), 0);
    
    	for (uint32_t i = 0; i < pSpheres.size(); ++i)
    		delete pSpheres[i];
    
    	std::cout << "Time taken to render : " << std::chrono::duration_cast<std::chrono::duration<double>>(endtimer - startTimer).count() << " seconds\n";
    
    	system("PAUSE");
    
    	return 0;
    }

  2. #2
    2kaud
    Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: A question about multi threading.

    At first look it seems to be. Your threads don't use shared variables or shared memory so no locking is required. .join() will ensure that all the threads have finished before the next section of code starts. memcpy() isn't copying overlapping memory so no locking should be required.

    However, you can simplify the code somewhat:

    Code:
    std::chrono::system_clock::time_point startTimer = std::chrono::system_clock::now();
    ...
    std::chrono::system_clock::time_point endtimer = std::chrono::system_clock::now();
    becomes:

    Code:
    const auto startTimer = std::chrono::system_clock::now();
    const auto endtimer = std::chrono::system_clock::now();
    Code:
    imageDataParts[0].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[1].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[2].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[3].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[4].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[5].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[6].resize(imageWidth * imageHeight * 3 / 8);
    imageDataParts[7].resize(imageWidth * imageHeight * 3 / 8);
    becomes:

    Code:
    for (auto& ip : imageDataParts)
        ip.resize(imageWidth * imageHeight * 3 / 8);
    Also I would have imageWidth & imageHeight as const as their values don't change.

    I would have a global const called something like numThreads and set to 8:

    Code:
    const int numThreads = 8;
    and then use this in the loops, buffer sizes, etc. so that the number of threads can be easily changed.

    Also, in loops unless you need the index value, you can use range-for. So:

    Code:
    for (uint32_t i = 0; i < 8; ++i)
    	threads[i].join();
    becomes:

    Code:
    for (auto& thrd : threads)
        thrd.join();
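
    Put together, the suggestions above might look like the following minimal sketch. The names numThreads, fillBand and renderBands are assumptions made for illustration, and fillBand is only a stand-in for the real Render function.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical name: the number of worker threads, defined in one place.
constexpr std::uint32_t numThreads = 8;

// Stand-in for Render(): each worker fills only its own buffer with its band index.
void fillBand(std::uint32_t band, std::vector<unsigned char>& part)
{
    for (auto& byte : part)
        byte = static_cast<unsigned char>(band);
}

// Starts one worker per band, waits for all of them and returns the per-band buffers.
std::vector<std::vector<unsigned char>> renderBands(std::uint32_t imageWidth, std::uint32_t imageHeight)
{
    // Like the original code, this assumes the height divides evenly by the thread count.
    assert(imageHeight % numThreads == 0);

    std::vector<std::vector<unsigned char>> parts(numThreads);
    for (auto& ip : parts)
        ip.resize(imageWidth * imageHeight * 3 / numThreads);

    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (std::uint32_t i = 0; i < numThreads; ++i)
        threads.emplace_back(fillBand, i, std::ref(parts[i]));

    // No index is needed here, so range-for is enough.
    for (auto& thrd : threads)
        thrd.join();

    return parts;
}
```

    Changing numThreads is then the only edit needed to try a different thread count, as long as the image height stays divisible by it.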
    All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!

    C++23 Compiler: Microsoft VS2022 (17.6.5)

  3. #3
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    @2kaud. Thank you boss

  4. #4
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    @2kaud can you show me an example of variable locking?

  5. #5
    2kaud
    Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: A question about multi threading.
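
    A minimal sketch of variable locking with std::mutex and std::lock_guard. The SharedCounter type and the function names are assumptions made for illustration; the ray tracer above does not need any of this because its threads share no writable data.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state: a counter that every worker updates.
struct SharedCounter
{
    std::mutex mutex;         // guards 'value'
    std::uint64_t value = 0;  // only modified while 'mutex' is held
};

// Each worker increments the shared counter once per pixel it "processes".
void countPixels(SharedCounter& counter, std::uint32_t pixels)
{
    for (std::uint32_t i = 0; i < pixels; ++i)
    {
        // The lock is taken here and released automatically at the end of the scope.
        std::lock_guard<std::mutex> lock(counter.mutex);
        ++counter.value;
    }
}

// Runs the workers and returns the final count.
std::uint64_t countWithThreads(std::uint32_t threadCount, std::uint32_t pixelsPerThread)
{
    assert(threadCount > 0);

    SharedCounter counter;
    std::vector<std::thread> threads;
    for (std::uint32_t i = 0; i < threadCount; ++i)
        threads.emplace_back(countPixels, std::ref(counter), pixelsPerThread);

    for (auto& thrd : threads)
        thrd.join();

    // All workers have been joined, so reading without the lock is safe here.
    return counter.value;
}
```

    Without the lock_guard the increments from different threads could interleave and the final count could come out too low. For a plain counter like this one, std::atomic would be a lighter alternative.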


  6. #6
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    Thank you 2kaud

  7. #7
    2kaud
    Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: A question about multi threading.

    If you are going to do much multi-thread work with C++ I strongly suggest you get:

    C++ Concurrency In Action by Anthony Williams (2nd Edition) https://www.amazon.co.uk/s?k=c%2B%2B...nb_sb_ss_i_1_9

    and/or the ebook Concurrency with Modern C++ https://leanpub.com/concurrencywithmodernc

  8. #8
    Join Date
    Feb 2017
    Posts
    677

    Re: A question about multi threading.

    I'm just curious. How many CPU cores do you have?

  9. #9
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    @wolle my CPU is an Intel Core i7 6700HQ. It has 4 cores

  10. #10
    Join Date
    Feb 2017
    Posts
    677

    Re: A question about multi threading.

    Quote Originally Posted by noobofcpp
    It has 4 cores
    I kind of expected that since you reported a speed increase of 3.5 times. It also indicates your code works as expected.

    is it correct use of multi threading
    It's correct but not the only way. Another (complementary) approach is to focus on tasks rather than threads. Since C++17 the standard library makes this view easier. For example, most algorithms in the algorithms library,

    https://en.cppreference.com/w/cpp/algorithm

    can be called with a parameter representing an execution policy. It allows the programmer to easily switch between sequential and parallel execution. One example is std::for_each.

    Say you turn your Render function into a class, say RenderTask. Then you use it to split the whole task into 8 sub-tasks very much like you do now. You store 8 RenderTask objects in a vector. Then if you loop through the vector using std::for_each with the parallel policy you get the same solution you have now with the difference that you don't have to deal with any threads explicitly. They are managed behind the scenes by the C++ runtime system.

    Sometimes you want to manage threads yourself but often you just want to turn sequential execution into parallel and then chances are a suitable algorithm can be found in the standard library.
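
    The RenderTask idea described above could be sketched roughly as follows. RenderTask, run and renderAll are hypothetical names, and the shading is a placeholder rather than real ray tracing. std::for_each with std::execution::par is standard C++17; with GCC's libstdc++ the parallel policies are typically backed by Intel TBB, which may then need to be linked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <execution>
#include <vector>

// Hypothetical RenderTask: one horizontal band of the image plus its own output buffer.
struct RenderTask
{
    std::uint32_t rowBegin;
    std::uint32_t rowEnd;
    std::uint32_t width;
    std::vector<unsigned char> pixels;

    void run()
    {
        pixels.assign(static_cast<std::size_t>(rowEnd - rowBegin) * width * 3, 0);
        std::size_t index = 0;
        for (std::uint32_t row = rowBegin; row < rowEnd; ++row)
        {
            for (std::uint32_t col = 0; col < width; ++col)
            {
                // Placeholder shading: the row number goes into the red channel.
                pixels[index] = static_cast<unsigned char>(row & 0xFF);
                index += 3;
            }
        }
    }
};

// Splits the image into taskCount bands and runs them with the parallel policy.
std::vector<RenderTask> renderAll(std::uint32_t width, std::uint32_t height, std::uint32_t taskCount)
{
    assert(taskCount > 0 && height % taskCount == 0);
    const std::uint32_t rowsPerTask = height / taskCount;

    std::vector<RenderTask> tasks;
    for (std::uint32_t i = 0; i < taskCount; ++i)
        tasks.push_back(RenderTask{ rowsPerTask * i, rowsPerTask * (i + 1), width, {} });

    // No std::thread objects appear here; the library decides how to schedule the tasks.
    std::for_each(std::execution::par, tasks.begin(), tasks.end(),
                  [](RenderTask& task) { task.run(); });
    return tasks;
}
```

    Replacing std::execution::par with std::execution::seq switches back to sequential execution without touching RenderTask. Each task writes only to its own pixels buffer, so no locking is needed.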
    Last edited by wolle; May 26th, 2019 at 05:05 AM.

  11. #11
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    @wolle. The time taken to render with one thread is 0.133174 seconds, while the multi-threaded version takes 0.0426739 seconds. What about OpenMP? According to this post https://stackoverflow.com/questions/...vs-c11-threads it says OpenMP is faster.
    Last edited by noobofcpp; May 26th, 2019 at 08:16 AM.

  12. #12
    2kaud
    Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: A question about multi threading.

    OpenMP and C++11 multi-threading work differently. As that reference mentions, OpenMP is usually implemented using thread pools, whereas with C++11 std::thread each thread is created explicitly. Creating threads is time consuming. In some instances (lots of threads created that each do very little CPU work) multi-threading can even be slower than single-threaded execution. Windows supports thread pools, so I would expect OpenMP to be faster in some circumstances. The more CPU work that is done within a thread, the smaller I would expect the difference to be. In your code you create and destroy 16 threads in total (8 for rendering and 8 for memcpy), which has an overhead.

    Also note that if you have 4 cores then at most 4 threads can actively run at any one time if they are all using CPU. If you reduce your thread count from 8 to 4, what are the new timings?
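
    For comparison, an OpenMP version of a row loop might look like the sketch below. The functions shade and renderWithOpenMP are hypothetical stand-ins, not the ray tracer's real code, and whether this is actually faster has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tracing one pixel.
inline unsigned char shade(int x, int y)
{
    return static_cast<unsigned char>((x + y) & 0xFF);
}

// Fills the red channel of a width x height RGB image, one row per loop iteration.
std::vector<unsigned char> renderWithOpenMP(int width, int height)
{
    assert(width > 0 && height > 0);
    std::vector<unsigned char> image(static_cast<std::size_t>(width) * height * 3, 0);

    // Rows are independent, so OpenMP may hand them out to its team of threads.
    // If OpenMP is not enabled (/openmp on MSVC, -fopenmp on GCC and Clang),
    // the pragma is ignored and the loop runs sequentially.
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const std::size_t index = (static_cast<std::size_t>(y) * width + x) * 3;
            image[index] = shade(x, y);
        }
    }
    return image;
}
```

    The loop index is a signed int because older OpenMP versions, including the one MSVC supports by default, require a signed loop variable.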

  13. #13
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    With four threads it took 0.0625277 seconds, while with 8 threads it took 0.0441792 seconds.

    [Attachment: Untitled.jpg]
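
    To put these timings in perspective, speedup and parallel efficiency can be computed as in the small sketch below (the function names are assumptions). Against the single-threaded time of 0.133174 seconds reported earlier, the 4-thread run is roughly 2.1 times faster and the 8-thread run roughly 3.0 times faster.

```cpp
#include <cassert>

// Speedup of a parallel run relative to the single-threaded time.
double speedup(double singleSeconds, double parallelSeconds)
{
    assert(parallelSeconds > 0.0);
    return singleSeconds / parallelSeconds;
}

// Parallel efficiency: speedup divided by the number of threads used (1.0 would be ideal).
double efficiency(double singleSeconds, double parallelSeconds, unsigned threads)
{
    assert(threads > 0);
    return speedup(singleSeconds, parallelSeconds) / threads;
}
```

    For example, speedup(0.133174, 0.0441792) gives the 8-thread figure, and efficiency(0.133174, 0.0441792, 8) shows how far that is from a perfect 8x scaling.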

  14. #14
    2kaud
    Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: A question about multi threading.

    Are you using hyper-threading? If yes, that would give the effect of 8 CPUs. If you open Task Manager > Performance, how many CPU graphs are shown?

  15. #15
    Join Date
    Oct 2017
    Posts
    50

    Re: A question about multi threading.

    Hyper-threading is one of my CPU's features, so I think yes. There are 8 CPU graphs in Task Manager.
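
    The number of logical processors that Task Manager shows can also be queried from code, so the thread count does not have to be hard-coded. A minimal sketch, where chooseThreadCount and its fallback parameter are assumptions made for illustration:

```cpp
#include <cassert>
#include <thread>

// Returns the number of logical processors reported by the implementation,
// or 'fallback' when that number is not available.
unsigned chooseThreadCount(unsigned fallback = 4)
{
    assert(fallback > 0);

    // hardware_concurrency() is only a hint and may return 0 if the value cannot be determined.
    const unsigned logical = std::thread::hardware_concurrency();
    return logical != 0 ? logical : fallback;
}
```

    On a 4-core CPU with hyper-threading enabled this would normally report 8, which matches the 8 graphs in Task Manager.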
