Quoted:
Quoted:
Get massively parallel, download some free AI software from Google and IBM, and build an intelligent being
Eh I've never seen those two words used in a sentence before like that, you are joking right?

IBM is letting people try Watson's analytics for free, but they're doing it on the SoftLayer infrastructure (IBM bought SL a year and a half ago). You can hand it a big ole chunk of data and let it do its thing. I don't think they allow that code to leave and run on anything else. |
|
It does, it never ceases to amaze me.
You should consider writing about your adventures with the parallella. |
|
|
|
Cool.
I upgraded my home Cisco CUCM, Unity Connection and IM&P servers from version 9.1 to version 10.5 last night. |
|
Quoted:
Quoted:
Get massively parallel, download some free AI software from Google and IBM, and build an intelligent being
Eh I've never seen those two words used in a sentence before like that, you are joking right?

https://code.google.com/p/word2vec/
IBM's Watson, which won at Jeopardy, was entirely built with Open Source Software. Here is one article, but there are articles describing exactly how to build it with free software:
http://www.aaai.org/Magazine/Watson/watson.php |
|
If any fellow geeks are interested in developing FPGA in C my latest blog just posted
http://forums.xilinx.com/t5/Xcell-Daily-Blog/Adam-Taylor-s-MicroZed-Chronicles-Part-85-SDSoC-the-first/ba-p/633707 |
|
Quoted:
If any fellow geeks are interested in developing FPGA in C my latest blog just posted http://forums.xilinx.com/t5/Xcell-Daily-Blog/Adam-Taylor-s-MicroZed-Chronicles-Part-85-SDSoC-the-first/ba-p/633707

You misspelled 'installment' |
|
Quoted:
If any fellow geeks are interested in developing FPGA in C my latest blog just posted http://forums.xilinx.com/t5/Xcell-Daily-Blog/Adam-Taylor-s-MicroZed-Chronicles-Part-85-SDSoC-the-first/ba-p/633707

Thanks! You most surely need to be more active in the nerd threads! Everything from Arduino to FPGA gets a thread every other week or so around here. |
|
Got my server version parallella in the mail today.
Next week I will order the cluster case and start building that out. |
|
I will try to. I have a load of Arduinos floating around the study.
|
|
|
Quoted:
Quoted:
Got my server version parallella in the mail today. Next week I will order the cluster case and start building that out.
Very exciting, any ideas what application you are going to run on it?

lol, no. I just like to tinker, doubt I will do anything that puts Adapteva on the map. I like astronomy and astrophysics stuff, have found source code repositories for that on the web, may start out trying to port some of those to parallella. I think the lack of double precision in Epiphany may hinder some of that though. |
|
Quoted:
lol, no. I just like to tinker, doubt I will do anything that puts Adapteva on the map. I like astronomy and astrophysics stuff, have found source code repositories for that on the web, may start out trying to port some of those to parallella. I think the lack of double precision in Epiphany may hinder some of that though.

Interesting, my day job involves astronomy and space systems. |
|
Quoted:
interesting my day job involves astronomy and space systems

I have a wall street background so that is another possibility. |
|
Quoted:
"wall street background" that sounds pretty cool

Too long ago to remember much, but I saw some CUDA example option pricing code I may play with. Again, the lack of double precision I think is the biggest hindrance to Adapteva on these versions of Epiphany. The low power requirements are cool, but serious work needs serious precision, and people who need that can pay for the power (Wall St) or get grants (academia).
|
|
|
Quoted:
Quoted:
Too long ago to remember much, but I saw some CUDA example option pricing code I may play with.
CUDA Coding... Maybe you can tell me why an app I wrote works on a GTX 460 but won't work on a GTX 960, even after being re-compiled? Am I missing something? |
|
Just working on a pretty simple nbody simulation for now, since astronomy subjects interest me.
Once I get it working ok on one parallella, I will work on trying to get MPI to distribute it across multiple boards. For me, the programming challenges are more interesting than the end result sometimes. geeks are easily amused. When I am kayaking or boating I am thinking about solutions to these kinds of challenges all the time. Others are thinking 'tits' the whole time. |
|
Haven't shipped the other board yet. It should go out tomorrow.
Quoted:
Just working on a pretty simple nbody simulation for now, since astronomy subjects interest me. Once I get it working ok on one parallella, I will work on trying to get MPI to distribute it across multiple boards. For me, the programming challenges are more interesting than the end result sometimes. geeks are easily amused. When I am kayaking or boating I am thinking about solutions to these kinds of challenges all the time. Others are thinking 'tits' the whole time. |
|
|
|
Quoted:
Quoted:
Cool. I upgraded my home Cisco CUCM, Unity Connection and IM&P servers from version 9.1 to version 10.5 last night.
I'm in the middle of going from 8.5 to 10.5 at work. It's way past time.

If you need any advice, let me know. I've done 6 or 8 of those. |
|
Quoted:
The code that runs on the epiphany cores

    int main(void)
    {
        e_coreid_t coreid;
        unsigned row, col;
        char *outbuf;
        int sqr;
        int core;

        outbuf = (char *) 0x0000;

        // Who am I? Query the CoreID from hardware.
        coreid = e_get_coreid();
        e_coords_from_coreid(coreid, &row, &col);

        core = (4 * row) + (col + 1);
        sqr = core * core;

        sprintf(outbuf, "Greetings from core 0x%03x! These are my coordinates: %d, %d - sqr = %d", coreid, row, col, sqr);

        return EXIT_SUCCESS;
    }

Question, you have outbuf as a pointer to 0x0000, how does that give output, or is it going to the 0x0100000 address from the C code? Little slow tonight.. |
|
Quoted:
Question, you have outbuf as a pointer to 0x0000, how does that give output, or is it going to the 0x0100000 address from the C code? Little slow tonight..

It is storing that string created by the sprintf() command in local memory for that epiphany core at address 0; this code runs on all 16 cores. The host program running on the ARM can read each core's memory to retrieve the string and print it out.

In the host code this is where I defined it to run on all 16 cores; that e_open() is creating a workgroup of cores that is 4 cores wide by 4 cores deep.

    e_open(&dev, 0, 0, 4, 4);
    e_reset_group(&dev);

And this is where it is reading the string from the core's memory; row and col are the core's position in the 4x4 workgroup.

    e_read(&dev, row, col, 0x0000, emsg, _BufSize);

I don't need this line in the code posted on page 2:

    e_alloc(&emem, _BufOffset, _BufSize);

The original example I was hacking up was passing the string back in shared DRAM, another way of passing data between the host and core. |
|
Quoted:
If you need any advice, let me know. I've done 6 or 8 of those.

Will do. I'm getting rusty, having not worked for a Cisco partner in 6 years. |
|
Quoted:
It is storing that string created by the sprintf() command in local memory for that epiphany core at address 0, this code runs on all 16 cores. The host program running on the ARM can read each core's memory to retrieve the string and print it out. In the host code this is where I defined it to run on all 16 cores, that e_open() is creating a workgroup of cores that is 4 cores wide by 4 cores deep. e_open(&dev, 0, 0, 4, 4); e_reset_group(&dev); And this is where it is reading the string from the core's memory, row and col is the core's position in the 4x4 workgroup. e_read(&dev, row, col, 0x0000, emsg, _BufSize); I don't need this line in the code posted on page 2: e_alloc(&emem, _BufOffset, _BufSize); The original example I was hacking up was passing the string back in shared dram, another way of passing data between the host and core.

Isn't your current program passing the core info back and forth in RAM? I guess my point of confusion seems to be that all cores are writing to the same hard coded *outbuf, so they'd overwrite the data at 0x0000 when called from each core. I guess I would use the sprintf return value to add to outbuf so *outbuf was "clean" RAM, or are you randomly printing which core has written to that area most recently?

I tend to veer away from using the 0x0 RAM address myself; on architectures different than this one, I go from top down. Is there an installed RAM call on the parallel cores, or is that something the ARM would pass to them? |
|
Quoted:
Isn't your current program passing the core info back and forth in ram? I guess my point of confusion seems to be that all cores are writing to the same hard coded *outbuf, so they'd overwrite the data at 0x000 when called from each core. I guess I would use the sprintf return value to add to outbuf so *outbuf was "clean" RAM, or are you randomly printing which core has written to that area most recently? I tend to veer away from using the 0x0 RAM address, myself on architectures different than this one, I go from top down. Is there an installed RAM call on the parallel cores, or is that something the ARM would pass to them?

Each core has its own local memory it is writing to, and that is what the host program is reading from. This is telling the host program to read the local memory of the core at row, col in the workgroup:

    e_read(&dev, row, col, 0x0000, emsg, _BufSize); |
|
Freakin parallel processing...and here I am having trouble with an Arduino project that is missing keypad scanning inputs at times. I'm a tard.
|
|
Just snagged a brand new parallella on eBay for $88
Going to order the 4 board case tomorrow and have a nice little mini cluster built out soon. The case comes from UK though so may take a while to get. |
|
Quoted:
Each core has its own local memory it is writing to, and that is what the host program is reading from. This is telling the host program to read the local memory of the core at row, col in the workgroup e_read(&dev, row, col, 0x0000, emsg, _BufSize);

Ahh, got it. I thought it was all using the same RAM pool and addresses. |
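For anyone else who tripped over the same thing, here is a minimal host-only simulation of the idea in plain C. It does not use the Epiphany SDK at all: `local_mem`, `sim_core_main`, `sim_read`, `sim_run_all` and `BUF_SIZE` are made-up names for illustration, and the message format is shortened. The point it tries to show is that "offset 0" is a different buffer for every (row, col), so the cores do not overwrite each other.

```c
#include <stdio.h>
#include <string.h>

#define ROWS 4
#define COLS 4
#define BUF_SIZE 128   /* hypothetical size, stands in for _BufSize */

/* One private buffer per simulated core: offset 0 of each buffer is a
   different location for every (row, col). */
char local_mem[ROWS][COLS][BUF_SIZE];

/* Stand-in for the device program: each core writes at its own offset 0. */
void sim_core_main(unsigned row, unsigned col)
{
    char *outbuf = local_mem[row][col];            /* this core's "address 0" */
    int core = (4 * (int)row) + ((int)col + 1);    /* same numbering as the thread's code */
    int sqr = core * core;

    snprintf(outbuf, BUF_SIZE, "core %d at (%u, %u): sqr = %d", core, row, col, sqr);
}

/* Stand-in for the host-side read of one core's local memory. */
void sim_read(unsigned row, unsigned col, char *dst, size_t n)
{
    memcpy(dst, local_mem[row][col], n);
}

/* Run all 16 simulated cores, then read each buffer back and print it. */
void sim_run_all(void)
{
    char msg[BUF_SIZE];

    for (unsigned r = 0; r < ROWS; r++)
        for (unsigned c = 0; c < COLS; c++)
            sim_core_main(r, c);

    for (unsigned r = 0; r < ROWS; r++)
        for (unsigned c = 0; c < COLS; c++) {
            sim_read(r, c, msg, sizeof msg);
            puts(msg);
        }
}
```

Calling `sim_run_all()` from a `main` prints one line per core. On real hardware the per-core separation comes from the chip's memory map rather than from a 3-D array, so treat this only as a mental model.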
|
How much local RAM does each core get? Is that merged as part of the main RAM, or part of the FPGA?
I guess I should start saving for one. |
|
Quoted:
How much local RAM does each core get? Is that merged as part of the main RAM, or part of the FPGA? I guess I should start saving for one.

https://www.parallella.org/2015/02/28/parallella-chronicles-part-five/ |
|
Getting close to having version 0.0001 of an nbody astronomy simulation coded.
Those are the stars orbiting each other type of simulation. No graphics yet, just sending the 'stars' to each core to process. Once I get this working ok on a single board, I will work on getting it distributed across 4 parallella boards and using all 64 cores (using MPI I guess), and add some graphics to display the motions of the stars. This is an nvidia example nbody simulation in case anyone is wondering 'WTF' an nbody simulation is. Mine won't be nearly as nice or fast, lol |
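One small piece of "sending the stars to each core" is deciding which core gets which stars. A rough sketch in C of an even split follows; `chunk_range` is a hypothetical helper, not anything from the Parallella SDK or MPI, and the same arithmetic would apply whether the workers are 16 cores on one board or 64 across four.

```c
#include <stddef.h>

/* Split n items across `parts` workers as evenly as possible.
   Worker `idx` (0-based) gets `*count` items starting at index `*start`.
   Assumes parts > 0 and idx < parts. */
void chunk_range(size_t n, size_t parts, size_t idx,
                 size_t *start, size_t *count)
{
    size_t base  = n / parts;   /* every worker gets at least this many */
    size_t extra = n % parts;   /* the first `extra` workers get one more */

    *count = base + (idx < extra ? 1 : 0);
    *start = idx * base + (idx < extra ? idx : extra);
}
```

With 1024 stars and 16 cores every core gets 64; with a count that does not divide evenly, the leftovers go to the lowest-numbered workers so no worker is more than one star ahead of another.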
|
|
This might be something to tackle first, a fluid simulation with N particles (no gravity, just collisions). It's written for Processing, an easy to use language for PC, but the algorithms are there.
I'd like to see the results, the native code is a bit slow once the particle count goes up in WebGL, full visual water simulation in browser if you have a 3D card Here (fun to play with even if you don't want to code it) Just a couple of ideas. Is just the C source code for the parallelia Bernoulli set program available somewhere without downloading the entire package? |
|
Quoted:
This might be something to tackle first, a fluid simulation with N particles (no gravity, just collisions). It's written for Processing, an easy to use language for PC, but the algorithms are there. I'd like to see the results, the native code is a bit slow once the particle count goes up in WebGL, full visual water simulation in browser if you have a 3D card Here (fun to play with even if you don't want to code it) Just a couple of ideas. Is just the C source code for the parallelia Bernoulli set program available somewhere without downloading the entire package?

You want the mandelbrot one? Here: http://github.com/parallella/parallella-examples/tree/master/mandelbrot |
|
This line is killing my nbody sim performance on the 16 cores:

    invDist = 1.0f / sqrtf(distSqr);

Epiphany being a RISC processor doesn't have division as a built in function, and God knows how that sqrtf() function is being handled. Sigh, the joys of being a coder. Tons more googling to do on this. I did order the 4 board cluster case for it today, $167 after exchange rates and shipping from UK, lol. |
|
My parallel code is running on the 16 cores and I am getting the same results from code running strictly on the ARM processor, so I am pretty confident my parallel logic is sound, just that division and square root headache to overcome.
And people think computer geeks have a cushy job in an air conditioned cube all day. |
|
Quoted:
This line is killing my nbody sim performance on the 16 cores invDist = 1.0f / sqrtf(distSqr); Epiphany being a RISC processor doesn't have division as a built in function, and God knows how that sqrtf() function is being handled. Sigh, the joys of being a coder.

Don't use the library sqrt function. Use an iterative Newton's Approximation of Square root to the accuracy you desire. It will run faster than the library function and be accurate to a few significant digits. It all depends on how accurate the sqrt must be, how much error can be tolerated. For visual things, you can use a 2 decimal point square root and be close, while being extremely fast.

There are other sqrt algorithms, depending on accuracy vs. speed. The sqrtf() function goes for accuracy. Newton's approximation uses the handy ½, which is a right shift, and the rest is addition and multiplication. |
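A minimal sketch of the Newton iteration being described, in C. `newton_sqrt` and its starting guess are my own assumptions, not code from the thread. Two caveats are worth stating: in the textbook form each step still contains one division (`a / x`), and on floats the ½ is a multiply by `0.5f`, since a right shift only halves integers.

```c
/* Newton (Heron) iteration for sqrt(a): x <- 0.5 * (x + a / x).
   `iters` trades accuracy for speed; returns 0 for non-positive input. */
float newton_sqrt(float a, int iters)
{
    if (a <= 0.0f)
        return 0.0f;

    /* Crude starting guess that is always >= sqrt(a). */
    float x = (a > 1.0f) ? a : 1.0f;

    for (int i = 0; i < iters; i++)
        x = 0.5f * (x + a / x);   /* one division per step */

    return x;
}
```

Each step roughly doubles the number of correct digits once the guess is close, so a handful of iterations is usually enough for display-quality results; a better starting guess cuts the count further.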
|
Quoted:
Don't use the library sqrt function. Use an iterative Newton's Approximation of Square root to the accuracy you desire. It will run faster than the library function and be accurate to a few significant digits. It all depends in how accurate the sqrt must be, how much error can be tolerated. For visual things, you can use a 2 decimal point square root and be close, while being extremely fast. There are other sqrt algorithms, depending on accuracy vs. speed. The sqrtf() function goes for accuracy. Newton's approximation uses the handy ½, which is a right shift, and the rest is addition and multiplication.

Thanks, I will look into that. The division is a problem too though. |
|
Quoted:
Thanks, I will look into that. The division is a problem too though

The div isn't a patch on the slowness of most sqrt libraries, I believe the cores have that instruction built in from what I've read. Is there a code profiler in the SDK, or do you need to do it the old fashioned way? |
|
I love the internet.
Found this, very fast in my code, loses a little accuracy, but orders of magnitude faster. Does exactly what I am trying to do without division or square root calls. Used in Quake III Arena code. https://en.wikipedia.org/wiki/Fast_inverse_square_root Even has the C code in that wiki, I uncommented that last line for a little better accuracy. The number in red below is the magic sauce and kind of a mystery how someone derived it.
float Q_rsqrt( float number ) |
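Since the listing got cut off in the post, here is a sketch of the technique from that Wikipedia article, written from memory of the published routine rather than copied from the poster's file. I have renamed it `fast_rsqrt` and swapped the original pointer-cast trick for `memcpy` with a `uint32_t`, which does the same bit reinterpretation without undefined behaviour; it assumes a 32-bit IEEE 754 `float`. The second Newton step is the "last line" that is commented out in the well-known version.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1 / sqrt(number) for number > 0 with no division and no sqrt call. */
float fast_rsqrt(float number)
{
    const float threehalfs = 1.5f;
    float x2 = number * 0.5f;
    float y  = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);          /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);         /* the magic constant: a cheap first guess */
    memcpy(&y, &i, sizeof y);          /* back to float */

    y = y * (threehalfs - (x2 * y * y));   /* 1st Newton iteration */
    y = y * (threehalfs - (x2 * y * y));   /* 2nd iteration, a little more accuracy */

    return y;
}
```

In the nbody loop this would stand in for the whole `1.0f / sqrtf(distSqr)` expression, e.g. `invDist = fast_rsqrt(distSqr);`, leaving only multiplies, subtracts and a shift.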
|
So if you all can plot out the universe movements on this thing can you model or simulate how drug molecules dock into proteins? Now that'd be cool and very useful in pharma world. Just sayin'
|
|
Quoted:
What I did with my Raspberry Pi, and of course I finished it one month before the Raspberry Pi 2 came out which can do N64 games... So I will need to do an update. http://i.imgur.com/cDN8fre.jpg?1 http://i.imgur.com/MhW43K4.jpg?1 http://i.imgur.com/PWcIOqX.jpg?1 Edit: Forgot my startup video! Vertical cam warning! http://youtu.be/xqBEzHY7xy4

how can a retard make this? ive been looking at buying one off the internet that has hundreds of games and will cost about 4k |
|
Copyright © 1996-2024 AR15.COM LLC. All Rights Reserved.