CURL

From BR Wiki
Revision as of 18:01, 7 May 2013 by Laura
Please note that Business Rules! 4.20 provides native HTTP support.


"cURL is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS and FILE. cURL supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, kerberos...), file transfer resume, proxy tunneling and a busload of other useful tricks.

Curl is free and open software that compiles and runs under a wide variety of operating systems. Curl exists thanks to efforts from many contributors. " -cURL Home Page

See also: Wikipedia:Web scraping

The basic idea is that you invoke cURL through a system call, giving it a URL and a file name to save the page under. cURL requests the URL you supply, just as if you had entered it in Internet Explorer, and saves the resulting page as an HTML file.
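As a rough illustration of that idea outside Business Rules!, the Python sketch below assembles the same kind of cURL command line. The helper names build_curl_command and fetch_page are made up for this example. Only the -A (user agent), -o (output file) and -s (silent) options come from cURL itself, and the sketch assumes curl is available on the PATH.

```python
import subprocess


def build_curl_command(url, out_path, user_agent="Mozilla/4.0"):
    """Return the argument list for a silent cURL download of url into out_path."""
    # -A sets the user-agent string, -o names the output file, -s hides progress output.
    return ["curl", url, "-A", user_agent, "-o", out_path, "-s"]


def fetch_page(url, out_path):
    """Run cURL and report whether it exited successfully (hypothetical helper)."""
    completed = subprocess.run(build_curl_command(url, out_path))
    return completed.returncode == 0
```

Calling fetch_page("http://example.com/", "page.html") would, under those assumptions, leave the downloaded page in page.html.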

In this example we wanted to write a function to perform a reverse phone number look-up. I can't share the whole source code because it is proprietary, but a short excerpt should do no harm. We call cURL and give it a URL at whitepages.com. We figured out that the URL [1] displays an HTML page showing who the phone number 817 274 5220 belongs to (which happens to be Nizza Pizza).


 00100 REV_LOOK: ! Reverse Lookup By Phone Number
 00120    def library Fnrev_Look(Number$,Mat Result$)
 00140       let Fnrev_Look=1 ! Assume Success !:
          mat Web_Page$(0)
 00150       if Exists("address.html") then execute "*free address.html"
 00160       execute 'sy -M curl http://www.whitepages.com/15055/search/ReversePhone?phone=' & Number$ & ' -A "Mozilla/4.0" -o address.html -s'
 00200       if Fnread_Page("address.html",Mat Web_Page$) then
 00270          let Response_Type=Fnget_Type(Mat Web_Page$)
 00272          if Response_Type then ! If Response Type Found !:
                let Fnrev_Look=Fnparse_Page(Response_Type, Mat Web_Page$, Mat Result$) ! Parse It !:
                else !:
                let Fnrev_Look=0 ! Failed To Determine Response Type
 00280       else
 00285          let Fnrev_Look=0 ! Failed To Read Page, Check Internet
 00290       end if
 00340 _END_REV_LOOK: fnend

This function takes the phone number to look up, builds the URL, and passes it to cURL on line 160. Line 160 tells cURL to perform the look-up and save the resulting page as address.html in the current directory: the -A option sets the user-agent string, -o names the output file, and -s suppresses cURL's progress output.

Then, on line 200, we call a function that reads the saved page into a matrix. Once the page is loaded, we call various functions to parse through the matrix looking for the address information.
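The source of Fnread_Page is not shown here, so the following is only a guess at its general shape, written in Python rather than Business Rules!. The name read_page and its return convention (a list of lines, or None on failure) are assumptions made for this sketch.

```python
def read_page(path):
    """Return the lines of the saved page as a list, or None if it cannot be read."""
    try:
        # errors="replace" keeps the read from failing on unexpected byte sequences.
        with open(path, encoding="utf-8", errors="replace") as handle:
            return [line.rstrip("\n") for line in handle]
    except OSError:
        # A missing or unreadable file is reported to the caller instead of raising.
        return None
```

Returning None on failure lets the caller tell "could not read the page" apart from "read an empty page", which is the same distinction the excerpt draws with its "Failed To Read Page" branch.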

During our investigation of the web site we discovered that whitepages.com returns several different page layouts depending on whether zero, one, or many results are found. Our parser functions look at the format of the address.html file that cURL saved to determine which type it is. Then, based on that type, they parse the results and build an address array called Mat Result$, which is returned to the caller.
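One way to picture the type detection is as a search for a marker string that appears in only one of the page layouts. The Python sketch below is an illustration only: the function name get_type, the type constants, and above all the marker strings are hypothetical, since the actual markup returned by whitepages.com is not reproduced on this page.

```python
NO_RESULTS, ONE_RESULT, MANY_RESULTS = 1, 2, 3

# Hypothetical marker strings; the real pages would need to be inspected
# to find text that reliably distinguishes each layout.
MARKERS = {
    NO_RESULTS: "no-results",
    ONE_RESULT: "single-listing",
    MANY_RESULTS: "listing-list",
}


def get_type(lines):
    """Return the response type whose marker appears in the page, or 0 if none does."""
    for response_type, marker in MARKERS.items():
        if any(marker in line for line in lines):
            return response_type
    return 0
```

Returning 0 for an unrecognized page mirrors the excerpt, where a false Response_Type makes the look-up report failure instead of attempting to parse an unknown layout.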

If you call this function with 817 274 5220, you end up with a Mat Results$ similar to:

  • Results$(1)="Nizza Pizza & Pasta"
  • Results$(2)="1430 S Cooper St"
  • Results$(3)="Arlington"
  • Results$(4)="TX"
  • Results$(5)="76013"