Scraping data | MATLAB | Forum

Scraping data
jacktheripper125 | Post 1 | January 18, 2018 - 8:18 am

Hi, I am trying to scrape the data from my temperature sensor. It is located at malcomp.mooo.com:1010.

I have put the temperature between span tags like the example, but it doesn't like it.

I'm sure I'm just missing something, but it doesn't seem to get any data.

Any ideas?

[code]

% Scrape a website to identify the current temperature. The
% temperature is then written to another ThingSpeak channel.

% Specify the URL of the page that reports the current temperature.

url = 'http://malcomp.mooo.com:1010/index.htm';

% TODO - Replace the [] with channel ID to write data to:
writeChannelID = 405063;

% TODO - Enter the Write API Key between the '' below:
writeAPIKey = 'UGEUH8A5T1V9AFA9';

% Fetch data and parse it to find information of interest. Learn more about
% the URLFILTER function by going to the Documentation tab on the right
% side pane of this page.
temp = urlfilter(url,'Temp');

display(temp, 'Temperature');

[/code]

Vinod | Post 2 | January 18, 2018 - 9:21 am

Looks like the site is loading some of the DOM dynamically. You will need to do this in two steps:

1) Create a ThingHTTP app

Set the URL to: http://malcomp.mooo.com:1010

Set the Parse String to: //span[2]/text()

Now you can hit this ThingHTTP from a device or from MATLAB, and the result will be the text in the second <span> on the page.

2) Create a MATLAB Analysis app with this code:

opts = weboptions('Timeout',15);
data = webread('INSERT API URL FROM STEP 1',opts)
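
A note on what step 2 returns: the value comes back as text rather than a number, usually with whitespace around it. A minimal sketch of the conversion, assuming the response looks like the sample string below (raw is a stand-in for the data returned by webread, not real output from the sensor page):

```matlab
% Stand-in for the text returned by webread; the real response may differ.
raw = sprintf('\n  18\n  ');

% strtrim removes the surrounding whitespace; str2double returns NaN
% when the text is not a valid number.
temperature = str2double(strtrim(raw));

if isnan(temperature)
    disp('Response was not numeric.');
else
    fprintf('Temperature: %g\n', temperature);
end
```

Checking for NaN before writing to a channel avoids storing a failed read as if it were a measurement.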

jacktheripper125 | Post 3 | January 18, 2018 - 11:01 am

I have no idea how the second part works, but it does. Now I need to push the data to the fields, but the examples given don't make sense to me.

I have tried the code below, but it is so far off it is probably laughable.

I am trying to get the data into the field lounge_Temp, and once I have that working I will try to get it to do the 4 other sensors.

Thank you for your help. I'm relieved it is at least scraping.

[code]

% Enter your MATLAB Code below

opts = weboptions('Timeout',15);
data = webread('https://api.thingspeak.com/apps/thinghttp/send_request?api_key=PC4EOCTMNDPPQOSW',opts)
writeChannelID = 405063;

% TODO - Enter the Write API Key between the '' below:
writeAPIKey = 'blahblahblah';

% Fetch data and parse it to find information of interest. Learn more about
% the URLFILTER function by going to the Documentation tab on the right
% side pane of this page.
lounge_Temp = urlfilter(numbers);

display(lounge_Temp, 'Temp');

% Write the temperature data to another channel specified by the
% 'writeChannelID' variable

display(['Note: To successfully write data to another channel, ',...
'assign the write channel ID and API Key to ''writeChannelID'' and ',...
'''writeAPIKey'' variables above. Also uncomment the line of code ',...
'containing ''thingSpeakWrite'' (remove ''%'' sign at the beginning of the line.)'])

% Learn more about the THINGSPEAKWRITE function by going to the Documentation tab on
% the right side pane of this page.

% thingSpeakWrite(writeChannelID, tempF, 'Writekey', writeAPIKey);

[/code]

Vinod | Post 4 | January 18, 2018 - 11:01 am

Just clarifying the reason you were unable to use WEBREAD to get to the data: the website was serving it on port 1010. This is a non-standard port for HTTP web servers, and MATLAB running in the cloud blocks access to non-standard ports.

By using the ThingHTTP app, we were able to put a redirection from the normal port (port 80) to the non-standard port (1010) on the website serving the data.

jacktheripper125 | Post 5 | January 18, 2018 - 11:30 am

OOHHH

I only picked 1010 because I could remember it. Port 80 and a few others are already in use, so I picked 1010 as it isn't used by any of my software.

What is the best port to use other than 80? I will move it to that port then.

Once I have done that, I would like to scrape this data from the page into their 4 fields. Can you help with the code to do it?

My fields are: 1 lounge_temp, 2 Loft_temp, 3 Lounge_humid, 4 Loft_humid.

Thank you for your help. I think I shot myself in the foot!

The page source comes out like this:

  The Fan is 0<br>
  <span>Temp: </span>
  <span>
  18
  </span>
  <span>Temp2: </span>
  <span>
  7
  </span>
  <span>humidity: </span>
  <span>
  60
  </span>
  <span>humidity2: </span>
  <span>
  81
  </span>
  <br>
  Click <a href="/H">here</a> turn the Fan on<br>
  Click <a href="/L">here</a> turn the Fan off<br>

Vinod | Post 6 | January 18, 2018 - 11:43 am

If you're not using port 80, you will need to use ThingHTTP to essentially proxy the data from port 1010 to port 80. As I showed in the example above, the temperature, 18, is in the second span, which is why I set the Parse String to extract the value in the second <span>.

 

You can create any number of ThingHTTPs, one for each field.

Say you want humidity2, you will change the Parse String to 

 //span[8]/text()

This parses the text and pulls the value within the 8th span.
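
In the page source posted earlier, the labels sit in the odd-numbered spans and the values in the even-numbered ones, so field k maps to span 2*k. A small sketch that lists the Parse String for each of the four fields, assuming the page layout stays exactly as posted (the field names are taken from the earlier post; the loop itself is only an illustration):

```matlab
% Field names as listed earlier in the thread.
fieldNames = {'lounge_temp', 'Loft_temp', 'Lounge_humid', 'Loft_humid'};

% Values are in every second <span>, so field k uses span 2*k.
parseStrings = arrayfun(@(k) sprintf('//span[%d]/text()', 2*k), ...
    1:numel(fieldNames), 'UniformOutput', false);

% Print one Parse String per field, ready to paste into a ThingHTTP.
for k = 1:numel(fieldNames)
    fprintf('field%d (%s): %s\n', k, fieldNames{k}, parseStrings{k});
end
```

If a span is added to or removed from the page, the indices shift and every Parse String has to be updated.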

 

Hope this helps explain the concept so you can modify accordingly.

jacktheripper125 | Post 7 | January 18, 2018 - 12:02 pm

Super. 3 questions please, I'm nearly there.

1. How often will the ThingHTTP or the app poll for information, and where do I change it?

2. Can I remove the spans and just look for the "temp:" before the temperature and "temp2:" before the second one? That way I can remove the spans from the code, saving some software space, as I'm pushing my luck on it.

3. How do I get that info to the correct fields in the analysis?

I am aware of my stupidity with this, but I'm so upside down with C+ my brain isn't coping well.

Thank you for your patience and help.

jacktheripper125 | Post 8 | January 18, 2018 - 1:02 pm

OK, I'm nearly, nearly there.

1. I set that; easy to answer.

2. Yes, I can.

3. Still can't figure it out.

My output is now as shown below, but I can't figure out the command to take the 4 numbers and fill them into the four fields.

Could anyone tell me what is needed? I have got part way with thingSpeakWrite(xxxxxx, analyzedData, 'WriteKey', xxxxxxxxxx);

It's the analyzedData bit I'm stuck on.

data =

'
17
6
61
83
'

cstapels (Moderator) | Post 9 | January 18, 2018 - 1:50 pm

Have a look at the documentation for thingSpeakWrite. There are some good examples there.

I think you want to use the name-value pair 'Fields':

thingSpeakWrite(xxxxxx,'values', analyzedData, 'WriteKey', xxxxxxxxxx,'fields',[1 2 3 4]);

You may have to transpose analyzedData, depending on what shape it is in. Use an apostrophe after the variable to transpose it: analyzedData'
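
For the four-line text output shown in the previous post, analyzedData first has to be built from the text. A minimal sketch, assuming the scraped text holds exactly the four numbers separated by line breaks (the sample string imitates that output; channelID and writeKey are placeholders, and the write call is left commented out so the snippet runs anywhere):

```matlab
% Sample of the scraped text: four numbers, one per line.
data = sprintf('\n17\n6\n61\n83\n');

% sscanf skips whitespace and returns a column vector of the numbers;
% the apostrophe transposes it into a 1-by-4 row vector.
analyzedData = sscanf(data, '%f')';

disp(analyzedData);

% Placeholder write call; fill in your own channel ID and key.
% thingSpeakWrite(channelID, 'Fields', [1 2 3 4], ...
%     'Values', analyzedData, 'WriteKey', writeKey);
```

Each element of the row vector then lines up with the matching entry in 'Fields'.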

jacktheripper125 | Post 10 | January 18, 2018 - 1:59 pm

Hi, thank you for your reply.

I looked at the examples over and over and thought they would work, but they only seem to be of this sort:

thingSpeakWrite(17504,[2.3,1.2,3.2,0.1],'WriteKey','23ZLGOBBU9TWHG2H')

Perfect, I thought, so off I went to use it, and then realised it was just writing the hard-coded 2.3 to the first field and so on. I don't know how to turn the output into a string. I tried your command, but it said:

Undefined function or variable 'analyzedData'.

Is no one else doing this, scraping data off a webpage and putting it into ThingSpeak?

A big thank you for your help.

cstapels (Moderator) | Post 11 | January 19, 2018 - 1:29 pm

Can you show the value, or some sample values, you have for analyzedData? Or show the command where you scrape the web data?

You need to be pretty explicit about the format so that ThingSpeak can put things where you want them.

Perhaps consider this format:

thingSpeakWrite(17504,'Fields',[1,4,6],'Values',{2.3,'on','good'},'WriteKey','23ZLGOBBU9TWHG2H')

The result of the above will be an entry as follows:

Time          field1  field2  field3  field4  field5  field6  field7  field8  status  location
{write time}  2.3                     on              good

Is that the effect you are looking for?

jacktheripper125 | Post 12 | January 19, 2018 - 1:48 pm

Kind of. I saw an example like that, but none of the examples I see take values from the output and put them into the channel. They all hard-code the values in the code, and I don't see the point in that. The output it sees is:

data =

'Lounge_Temp
17
Loft_Temp
6
Lounge_Humidity
64
Loft_Humidity
88'

I have fields named Lounge_Temp, Loft_Temp, etc., so I need the value after each name to go into its field. Every time I try to do this I get an error.

I would have thought this is a normal thing to do, but I can't find any examples.

Hopefully you can help?
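
One possible way to pair each name with the number that follows it is a regular expression with two capture groups. This is only a sketch, assuming the text has the name-then-value layout shown above; the sample string copies that output, and the readings struct and fieldOrder list are illustrative names, not part of any ThingSpeak API:

```matlab
% Sample text in the same layout as the scraped output.
data = sprintf(['Lounge_Temp\n17\nLoft_Temp\n6\n', ...
                'Lounge_Humidity\n64\nLoft_Humidity\n88']);

% Capture a name, then the (possibly negative or decimal) number after it.
tokens = regexp(data, '([A-Za-z_]\w*)\s+(-?\d+(?:\.\d+)?)', 'tokens');

% Store each reading under its name.
readings = struct();
for k = 1:numel(tokens)
    readings.(tokens{k}{1}) = str2double(tokens{k}{2});
end

% Order the values to match channel fields 1 to 4.
fieldOrder = {'Lounge_Temp', 'Loft_Temp', 'Lounge_Humidity', 'Loft_Humidity'};
values = cellfun(@(name) readings.(name), fieldOrder);
disp(values);
```

The values vector can then be passed to thingSpeakWrite in a single call instead of being typed into the code by hand.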

jacktheripper125 | Post 13 | January 19, 2018 - 4:31 pm

If anyone is interested, I fixed it. Rather than trying to handle all the data in one go and feed it to the channel, I made 4 MATLAB apps and set them to keep requesting data.

This is the code that worked:

% Template MATLAB code for reading numeric data from a webpage, analyzing
% the data and storing the analyzed data in a channel.

% Prior to running this MATLAB code template, assign the url for the
% webpage to scrape to the 'url' variable. Also assign the target string to
% search for in the web page to the 'targetString' variable.

% To store the scraped data, you will need to write it to a channel other
% than the one you are reading data from. Assign this channel ID to the
% 'writeChannelID' variable. Also assign the write API Key to the
% 'writeAPIKey' variable below. You can find the write API Key in the right
% side pane of this page as well.

% TODO - Specify URL of the page to read data from
url = 'https://api.thingspeak.com/apps/thinghttp/send_request?api_key=xxxxxxxxxxxxxxx';
% TODO - Specify the target string to search in webpage
targetString = 'Lounge_Temp';

% TODO - Replace the [] with channel ID to write data to:
writeChannelID = xxxxxx;
% TODO - Enter the Write API Key between the '' below:
writeAPIKey = 'xxxxxxxxxxx';

%% Scrape the webpage %%
data = urlfilter(url, targetString);
display(data);

%% Analyze Data %%
% Add code in this section to analyze data and store the result in the
% analyzedData variable.
analyzedData = data;

%% Write Data %%
thingSpeakWrite(writeChannelID, {analyzedData,'Lounge_Temp'}, 'WriteKey', writeAPIKey);

Vinod | Post 14 | January 19, 2018 - 9:33 pm

You can use a single MATLAB analysis to scrape your data, parse the different fields and write it to 4 fields of a channel, or 4 different channels, if that is what you wish.

I was trying to write the example for you, but your website, malcomp.mooo.com:1010, does not seem to be up.

Vinod | Post 15 | January 19, 2018 - 11:39 pm

I set up a ThingHTTP that reads the #T1 element on your page to get its data.

Here's a simple MATLAB Analysis App example that parses the data and writes it into fields 1 through 4 of your channel in a single write.

opts = weboptions('Timeout',18);
data = webread('https://api.thingspeak.com/apps/thinghttp/send_request?api_key=USE YOUR THINGHTTP API KEY HERE',opts);
d2 = strsplit(data, char(13));
LoungeTemp = str2double(regexprep(d2{2},'[^0-9]', ''));
LoftTemp = str2double(regexprep(d2{4},'[^0-9]', ''));
LoungeHumidity = str2double(regexprep(d2{6},'[^0-9]', ''));
LoftHumidity = str2double(regexprep(d2{8},'[^0-9]', ''));
thingSpeakWrite(YOURCHANNELIDHERE, [LoungeTemp, LoftTemp, LoungeHumidity, LoftHumidity], 'WriteKey', 'USE YOUR CHANNEL WRITE API KEY HERE');

 

If you're new to MATLAB, you can search for what some of the functions like STRREP, STRSPLIT, STR2DOUBLE, REGEXPREP, etc. do here.
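
A possible refinement, offered as a sketch rather than as part of the example above: str2double returns NaN when a line has no digits left after the regexprep step, so a failed read can be skipped instead of written. Note also that the '[^0-9]' pattern drops minus signs and decimal points, which matters if a sensor ever reports such values. The values vector below is made up for illustration, and channelID and writeKey are placeholders:

```matlab
% Hypothetical parse result where the third reading failed.
values = [17 6 NaN 88];

% Keep only the readings that parsed, and remember their field numbers.
valid = ~isnan(values);
fieldsToWrite = find(valid);
valuesToWrite = values(valid);

fprintf('Writing fields: %s\n', mat2str(fieldsToWrite));

% Placeholder write call; fill in your own channel ID and key.
% thingSpeakWrite(channelID, 'Fields', fieldsToWrite, ...
%     'Values', valuesToWrite, 'WriteKey', writeKey);
```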

jacktheripper125 | Post 16 | February 6, 2018 - 6:10 pm

WOW!

Thank you.

It works. I would never have got to that point; too many syntax traps for me in that code!

Thank you, thank you, thank you!

Forum Timezone: America/New_York
