Warning

 

Close

Confirm Action

Are you sure you wish to do this?

Confirm Cancel
Member Login

Posted: 3/14/2011 2:09:23 PM EST
I've got a column of water temperatures, every minute for a month. 44640 points of data for a 31 day month.

Times six systems, times several years.

I need to know the maximum temperature achieved during the month.

But wait, it isn't that simple.

Occasionally a valve will open and a different cooling pump kicks on, causing a one to three minute temperature spike.

Since these are not normal/typical conditions, I would like to ignore these anomalies.

The initial average temperature might be 60F, then after the brief spike the average temperature might be 40F.

These temperatures vary, and the degree of the spike varies. Might be 120F or only 80F.

What method can I use to get Excel to detect and ignore the spike, so it doesn't get counted as the maximum temperature?
Link Posted: 3/14/2011 2:16:13 PM EST
[Last Edit: 3/14/2011 2:18:16 PM EST by COZ]
Try this forum. The people here are ridicuously smart and have always had a solution for me when I needed it.

http://www.excelforum.com/


Seems "Conditional Formatting" MIGHT work if you tell it to ignore anything greater than whatever max temp.

You also might be able to sort the data based on the temps and delete the cells with the spikes.
Link Posted: 3/14/2011 2:20:16 PM EST
Originally Posted By COZ:
Try this forum. The people here are ridicuously smart and have always had a solution for me when I needed it.

http://www.excelforum.com/



Awesome. Thanks for the link.

Link Posted: 3/14/2011 2:21:02 PM EST
[Last Edit: 3/14/2011 2:25:13 PM EST by NoVaGator]
Normalize the column with a simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile
Link Posted: 3/14/2011 2:23:22 PM EST
Hmm... It's been a few years but I have taken an Excel class. I don't remember doing anything specifically like that.

I opened excel and messed with it but I couldn't figure out how to set it to ignore a set value.

Tag.
Link Posted: 3/14/2011 2:26:24 PM EST
[Last Edit: 3/14/2011 2:27:48 PM EST by arowneragain]
Originally Posted By NoVaGator:
Simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile


That's where I would start.

edit:

Or I would display it graphically with time on one axis and temperature on the other and just eyeball it.
Link Posted: 3/14/2011 2:30:50 PM EST
Originally Posted By arowneragain:
Originally Posted By NoVaGator:
Simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile


That's where I would start.

edit:

Or I would display it graphically with time on one axis and temperature on the other and just eyeball it.


That is what I've been doing, but it is pretty time consuming.
I'd like to spend a little more time finding the right formula, so I can save time in the future.

Link Posted: 3/14/2011 2:34:00 PM EST
I would actually use something more complex than b>100 as the conditional statement, but that's a start

Posted Via AR15.Com Mobile
Link Posted: 3/14/2011 2:35:19 PM EST

Originally Posted By NoVaGator:
Normalize the column with a simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile

That didn't work for me, but I got one to work by making a second column.

Make a second column, with an "=IF(C2<100,C2)". If it is under 100, it will display the number, if not it says "FALSE". Then average the second column of numbers.


Link Posted: 3/14/2011 2:35:24 PM EST
Originally Posted By NoVaGator:
Normalize the column with a simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile


It is a good idea, but the problem I have is for winter months, the average water temps might be 40 degrees.
The spike might only be up to 80 degrees. Or as in January, it jumped to 122F.

In summer, the average might be 70, and the spike might be up to 150.

I've been trying to figure out if there is a way for Excel to look at a minute before and after each data point.
If it is so far above the average of the two, it would be a spike, and ignored.

But imagining it and figuring out how to make it happen are two different things.
Link Posted: 3/14/2011 2:35:39 PM EST
Originally Posted By The_Reaper:
Originally Posted By arowneragain:
Originally Posted By NoVaGator:
Simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile


That's where I would start.

edit:

Or I would display it graphically with time on one axis and temperature on the other and just eyeball it.


That is what I've been doing, but it is pretty time consuming.
I'd like to spend a little more time finding the right formula, so I can save time in the future.



I think I could sit down and write something you could use, but I'd need to know a lot about the data - namely, a full scope of what those spikes looked like, so we could tweak our formulas to encompass the whole spike. Most likely, the best way is to check to see if any given value is beyond a certain function of the preceeding value - i.e., if one minute is more than 1.25x the value of a preceding value, or 1.05x or whatever the amount is, depending on how fast these spikes show up.

Link Posted: 3/14/2011 2:37:10 PM EST
[Last Edit: 3/14/2011 2:55:26 PM EST by GumbyTM]
Can you use the average as a qualifier?

ie.

If(And([testcell]>110%avg(range),[testcell]<80%avv(range)),do stuff,else)

If you post up a data set I'll give you an exact formula.




ETA:

If you fill column A with =Rand() and use this in Col B you'll get my idea.
this uses Col B as a temp row to check data, in this case within 80% to 110% of the average.

Use =Average(B:B) as your cleaned data.

Move the 1.1 and .8 to external cell references to adjust on the fly.

=IF(AND(A5<(1.1*AVERAGE(A:A)),A5>(0.8*AVERAGE(A:A))),A5,"")
Use =Average(B:B) as your cleaned data.

Link Posted: 3/14/2011 2:38:23 PM EST
Originally Posted By The_Reaper:

It is a good idea, but the problem I have is for winter months, the average water temps might be 40 degrees.
The spike might only be up to 80 degrees. Or as in January, it jumped to 122F.

In summer, the average might be 70, and the spike might be up to 150.

I've been trying to figure out if there is a way for Excel to look at a minute before and after each data point.
If it is so far above the average of the two, it would be a spike, and ignored.

But imagining it and figuring out how to make it happen are two different things.


I think my previous post may help you with this.

=if(b2>(b1*1.25),average(b1,b3),b2)
Link Posted: 3/14/2011 2:40:49 PM EST
Originally Posted By AJE:

Originally Posted By NoVaGator:
Normalize the column with a simple IF statement

=if(b2>100,average(b1,b3),b2)


Posted Via AR15.Com Mobile

That didn't work for me, but I got one to work by making a second column.

Make a second column, with an "=IF(C2<100,C2)". If it is under 100, it will display the number, if not it says "FALSE". Then average the second column of numbers.

http://i31.photobucket.com/albums/c353/AJErvin/Forum%20Images/excel.jpg


This would be great, however, lets pick two months like January and August.
A cold and a hot month.
I need this thing to work equally in both environments.

January, for example, had an average of 40 degree water temps until mid month when something in the system changed.
After which the average temps became 60 degrees.

Lets say the data looked like this for January:

40
40
40
65
90
60
60
60

Later that month, it looked like this:

60
60
60
120
80
40
40
40


And in July it was:

60
60
60
120
80
70
70
70


I can't draw a line and pick a number for it to count as the magic number, above which is a temperature spike.

My number could easily be too big, or too small.

Link Posted: 3/14/2011 2:44:32 PM EST
ok, let's say we're on row 10, with your data in column A

this will be in column B

=IF(A10>((average(a1:a9))+(average(a11:a19))/2)*1.20), average(a9,a11), a10)


So what I'm saying is that if A10 is 20% higher than the average of the preceding 9 and the following 9 cells, then use the average of the prior and next cell, otherwise use the A10 number.

(My parentheses may be off....)

Link Posted: 3/14/2011 2:44:43 PM EST
You can either put your temps on a log scale and view on that, or used a capped scale where any number below or above is considered a 3 sigma observation. See "winsonized" mean.
Link Posted: 3/14/2011 2:45:01 PM EST
Originally Posted By The_Reaper:

I can't draw a line and pick a number for it to count as the magic number, above which is a temperature spike.

My number could easily be too big, or too small.



If those numnebrs are real, thus far it seems that 1.35x would fit well.

Link Posted: 3/14/2011 2:47:56 PM EST
Originally Posted By NoVaGator:
ok, let's say we're on row 10, with your data in column A

this will be in column B

=IF(A10>((average(a1:a9))+(average(a11:a19))/2)*1.35), average(a9,a11), a10)


So what I'm saying is that if A10 is 20% higher than the average of the preceding 9 and the following 9 cells, then use the average of the prior and next cell, otherwise use the A10 number.

(My parentheses may be off....)



Try the part in bold, reaper.




NoVaGator,

Parentheses look good.
Link Posted: 3/14/2011 2:55:54 PM EST
Charts and graphs ignore "NA()"

If you're getting spikes because of points of no data, put an if statement in the calculation, and make it return NA() and the chart will ignore it.
Link Posted: 3/14/2011 3:01:43 PM EST
Originally Posted By NoVaGator:
ok, let's say we're on row 10, with your data in column A

this will be in column B

=IF(A10>((average(a1:a9))+(average(a11:a19))/2)*1.20), average(a9,a11), a10)



I think this is very much what I was looking for.

I'll give it a try.

It will have to wait until tomorrow. I can't access my work temp logs from home, and I can't access AR15.com from work.

Thanks!
Link Posted: 3/14/2011 3:08:18 PM EST
Glad I could help

Let us know how it turns out

There are a few more statistically elegant ways to do this, but this should suffice for your purposes


Posted Via AR15.Com Mobile
Link Posted: 3/14/2011 3:13:26 PM EST
this would be better with Pi
Link Posted: 3/14/2011 3:25:24 PM EST

Originally Posted By The_Reaper:
Originally Posted By NoVaGator:
ok, let's say we're on row 10, with your data in column A

this will be in column B

=IF(A10>((average(a1:a9))+(average(a11:a19))/2)*1.20), average(a9,a11), a10)



I think this is very much what I was looking for.

I'll give it a try.

It will have to wait until tomorrow. I can't access my work temp logs from home, and I can't access AR15.com from work.

Thanks!

The only danger here is that since the spike can last for 1-3 minutes and you have one row per minute, the spike could happen on several cells in a row like this:

55
56
120
119
122
57
54

So you wouldn't want average(a9,a11) in the example above because a9 and a11 could also be part of the spike. You need some way to take an average from outside the spike. If you're sure that a spike can be no longer than three minutes and that the previous or next spike will be at least 13 minutes away, then you can slightly change NoVaGator's suggestion and have it take the average of ten cells back (skipping the two prior cells) and ten cells forward (skipping the next two cells).

I would do it in two stages just to avoid repeating complex expressions. The first new column would be the averages from behind and ahead:

cell b12: = ( ( average(a1:a9) ) + (average(a15:a24) ) / 2 )
cell c12 = if( a12 > (b12*1.20), b12, a12 )

When deciding how much over the average counts as a spike, you have to keep in mind that you could be on a spike value but one of the two averages you're comparing against (10 cells back and 10 cells forward) could contain a spike. If the spike drives the average up enough it could make a spike appear normal. But this can't be the case if you know the minimum distance between spikes and your averages don't span that far.
Top Top