how to get the cookies from a php curl into a variable

PhpCookiesCurl

Php Problem Overview


So some guy at some other company thought it would be awesome if instead of using soap or xml-rpc or rest or any other reasonable communication protocol he just embedded all of his response as cookies in the header.

I need to pull these cookies out as hopefully an array from this curl response. If I have to waste a bunch of my life writing a parser for this I will be very unhappy.

Does anyone know how this can simply be done, preferably without writing anything to a file?

I will be very grateful if anyone can help me out with this.

Php Solutions


Solution 1 - Php

$ch = curl_init('http://www.google.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// get headers too with this line
curl_setopt($ch, CURLOPT_HEADER, 1);
$result = curl_exec($ch);
// get cookie
// multi-cookie variant contributed by @Combuster in comments
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $result, $matches);
$cookies = array();
foreach($matches[1] as $item) {
    parse_str($item, $cookie);
    $cookies = array_merge($cookies, $cookie);
}
var_dump($cookies);

Solution 2 - Php

Although this question is quite old, and the accepted response is valid, I find it a bit unconfortable because the content of the HTTP response (HTML, XML, JSON, binary or whatever) becomes mixed with the headers.

I've found a different alternative. CURL provides an option (CURLOPT_HEADERFUNCTION) to set a callback that will be called for each response header line. The function will receive the curl object and a string with the header line.

You can use a code like this (adapted from TML response):

$cookies = Array();
$ch = curl_init('http://www.google.com/');
// Ask for the callback.
curl_setopt($ch, CURLOPT_HEADERFUNCTION, "curlResponseHeaderCallback");
$result = curl_exec($ch);
var_dump($cookies);

function curlResponseHeaderCallback($ch, $headerLine) {
    global $cookies;
    if (preg_match('/^Set-Cookie:\s*([^;]*)/mi', $headerLine, $cookie) == 1)
        $cookies[] = $cookie;
    return strlen($headerLine); // Needed by curl
}

This solution has the drawback of using a global variable, but I guess this is not an issue for short scripts. And you can always use static methods and attributes if curl is being wrapped into a class.

Solution 3 - Php

This does it without regexps, but requires the PECL HTTP extension.

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$result = curl_exec($ch);
curl_close($ch);

$headers = http_parse_headers($result);
$cookobjs = Array();
foreach($headers AS $k => $v){
	if (strtolower($k)=="set-cookie"){
		foreach($v AS $k2 => $v2){
			$cookobjs[] = http_parse_cookie($v2);
		}
	}
}

$cookies = Array();
foreach($cookobjs AS $row){
	$cookies[] = $row->cookies;
}

$tmp = Array();
// sort k=>v format
foreach($cookies AS $v){
	foreach ($v  AS $k1 => $v1){
		$tmp[$k1]=$v1;
	}
}

$cookies = $tmp;
print_r($cookies);

Solution 4 - Php

If you use CURLOPT_COOKIE_FILE and CURLOPT_COOKIE_JAR curl will read/write the cookies from/to a file. You can, after curl is done with it, read and/or modify it however you want.

Solution 5 - Php

libcurl also provides CURLOPT_COOKIELIST which extracts all known cookies. All you need is to make sure the PHP/CURL binding can use it.

Solution 6 - Php

someone here may find it useful. hhb_curl_exec2 works pretty much like curl_exec, but arg3 is an array which will be populated with the returned http headers (numeric index), and arg4 is an array which will be populated with the returned cookies ($cookies["expires"]=>"Fri, 06-May-2016 05:58:51 GMT"), and arg5 will be populated with... info about the raw request made by curl.

the downside is that it requires CURLOPT_RETURNTRANSFER to be on, else it error out, and that it will overwrite CURLOPT_STDERR and CURLOPT_VERBOSE, if you were already using them for something else.. (i might fix this later)

example of how to use it:

<?php
header("content-type: text/plain;charset=utf8");
$ch=curl_init();
$headers=array();
$cookies=array();
$debuginfo="";
$body="";
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$body=hhb_curl_exec2($ch,'https://www.youtube.com/',$headers,$cookies,$debuginfo);
var_dump('$cookies:',$cookies,'$headers:',$headers,'$debuginfo:',$debuginfo,'$body:',$body);

and the function itself..

function hhb_curl_exec2($ch, $url, &$returnHeaders = array(), &$returnCookies = array(), &$verboseDebugInfo = "")
{
    $returnHeaders    = array();
    $returnCookies    = array();
    $verboseDebugInfo = "";
    if (!is_resource($ch) || get_resource_type($ch) !== 'curl') {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }
    $verbosefileh = tmpfile();
    $verbosefile  = stream_get_meta_data($verbosefileh);
    $verbosefile  = $verbosefile['uri'];
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    curl_setopt($ch, CURLOPT_STDERR, $verbosefileh);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    $html             = hhb_curl_exec($ch, $url);
    $verboseDebugInfo = file_get_contents($verbosefile);
    curl_setopt($ch, CURLOPT_STDERR, NULL);
    fclose($verbosefileh);
    unset($verbosefile, $verbosefileh);
    $headers       = array();
    $crlf          = "\x0d\x0a";
    $thepos        = strpos($html, $crlf . $crlf, 0);
    $headersString = substr($html, 0, $thepos);
    $headerArr     = explode($crlf, $headersString);
    $returnHeaders = $headerArr;
    unset($headersString, $headerArr);
    $htmlBody = substr($html, $thepos + 4); //should work on utf8/ascii headers... utf32? not so sure..
    unset($html);
    //I REALLY HOPE THERE EXIST A BETTER WAY TO GET COOKIES.. good grief this looks ugly..
    //at least it's tested and seems to work perfectly...
    $grabCookieName = function($str)
    {
        $ret = "";
        $i   = 0;
        for ($i = 0; $i < strlen($str); ++$i) {
            if ($str[$i] === ' ') {
                continue;
            }
            if ($str[$i] === '=') {
                break;
            }
            $ret .= $str[$i];
        }
        return urldecode($ret);
    };
    foreach ($returnHeaders as $header) {
        //Set-Cookie: crlfcoookielol=crlf+is%0D%0A+and+newline+is+%0D%0A+and+semicolon+is%3B+and+not+sure+what+else
        /*Set-Cookie:ci_spill=a%3A4%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%22305d3d67b8016ca9661c3b032d4319df%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A14%3A%2285.164.158.128%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A109%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F43.0.2357.132+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1436874639%3B%7Dcab1dd09f4eca466660e8a767856d013; expires=Tue, 14-Jul-2015 13:50:39 GMT; path=/
        Set-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT;
        //Cookie names cannot contain any of the following '=,; \t\r\n\013\014'
        //
        */
        if (stripos($header, "Set-Cookie:") !== 0) {
            continue;
            /**/
        }
        $header = trim(substr($header, strlen("Set-Cookie:")));
        while (strlen($header) > 0) {
            $cookiename                 = $grabCookieName($header);
            $returnCookies[$cookiename] = '';
            $header                     = substr($header, strlen($cookiename) + 1); //also remove the = 
            if (strlen($header) < 1) {
                break;
            }
            ;
            $thepos = strpos($header, ';');
            if ($thepos === false) { //last cookie in this Set-Cookie.
                $returnCookies[$cookiename] = urldecode($header);
                break;
            }
            $returnCookies[$cookiename] = urldecode(substr($header, 0, $thepos));
            $header                     = trim(substr($header, $thepos + 1)); //also remove the ;
        }
    }
    unset($header, $cookiename, $thepos);
    return $htmlBody;
}

function hhb_curl_exec($ch, $url)
{
    static $hhb_curl_domainCache = "";
    //$hhb_curl_domainCache=&$this->hhb_curl_domainCache;
    //$ch=&$this->curlh;
    if (!is_resource($ch) || get_resource_type($ch) !== 'curl') {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }
    
    $tmpvar = "";
    if (parse_url($url, PHP_URL_HOST) === null) {
        if (substr($url, 0, 1) !== '/') {
            $url = $hhb_curl_domainCache . '/' . $url;
        } else {
            $url = $hhb_curl_domainCache . $url;
        }
    }
    ;
    
    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
    if (curl_errno($ch)) {
        throw new Exception('Curl error (curl_errno=' . curl_errno($ch) . ') on url ' . var_export($url, true) . ': ' . curl_error($ch));
        // echo 'Curl error: ' . curl_error($ch);
    }
    if ($html === '' && 203 != ($tmpvar = curl_getinfo($ch, CURLINFO_HTTP_CODE)) /*203 is "success, but no output"..*/ ) {
        throw new Exception('Curl returned nothing for ' . var_export($url, true) . ' but HTTP_RESPONSE_CODE was ' . var_export($tmpvar, true));
    }
    ;
    //remember that curl (usually) auto-follows the "Location: " http redirects..
    $hhb_curl_domainCache = parse_url(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), PHP_URL_HOST);
    return $html;
}

Solution 7 - Php

The accepted answer seems like it will search through the entire response message. This could give you false matches for cookie headers if the word "Set-Cookie" is at the beginning of a line. While it should be fine in most cases. The safer way might be to read through the message from the beginning until the first empty line which indicates the end of the message headers. This is just an alternate solution that should look for the first blank line and then use preg_grep on those lines only to find "Set-Cookie".

    curl_setopt($ch, CURLOPT_HEADER, 1);
    //Return everything
    $res = curl_exec($ch);
    //Split into lines
    $lines = explode("\n", $res);
    $headers = array();
    $body = "";
    foreach($lines as $num => $line){
        $l = str_replace("\r", "", $line);
        //Empty line indicates the start of the message body and end of headers
        if(trim($l) == ""){
            $headers = array_slice($lines, 0, $num);
            $body = $lines[$num + 1];
            //Pull only cookies out of the headers
            $cookies = preg_grep('/^Set-Cookie:/', $headers);
            break;
        }
    }

Solution 8 - Php

My understanding is that cookies from curl must be written out to a file (curl -c cookie_file). If you're running curl through PHP's exec or system functions (or anything in that family), you should be able to save the cookies to a file, then open the file and read them in.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionthirsty93View Question on Stackoverflow
Solution 1 - PhpTMLView Answer on Stackoverflow
Solution 2 - PhpGoogolView Answer on Stackoverflow
Solution 3 - PhpAlex PView Answer on Stackoverflow
Solution 4 - PhpSoapBoxView Answer on Stackoverflow
Solution 5 - PhpDaniel StenbergView Answer on Stackoverflow
Solution 6 - PhphanshenrikView Answer on Stackoverflow
Solution 7 - PhpRich WandellView Answer on Stackoverflow
Solution 8 - PhpkyleView Answer on Stackoverflow