Executing Javascript from Python

JavascriptPythonScreen Scraping

Javascript Problem Overview


I have HTML webpages that I am crawling using xpath. The etree.tostring of a certain node gives me this string:

<script>
<!--
function escramble_758(){
  var a,b,c
  a='+1 '
  b='84-'
  a+='425-'
  b+='7450'
  c='9'
  document.write(a+c+b)
}
escramble_758()
//-->
</script>

I just need the output of escramble_758(). I can write a regex to figure out the whole thing, but I want my code to remain tidy. What is the best alternative?

I am zipping through the following libraries, but I didnt see an exact solution. Most of them are trying to emulate browser, making things snail slow.

Edit: An example will be great.. (barebones will do)

Javascript Solutions


Solution 1 - Javascript

You can also use Js2Py which is written in pure python and is able to both execute and translate javascript to python. Supports virtually whole JavaScript even labels, getters, setters and other rarely used features.

import js2py

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
""".replace("document.write", "return ")

result = js2py.eval_js(js)  # executing JavaScript and converting the result to python string 

Advantages of Js2Py include portability and extremely easy integration with python (since basically JavaScript is being translated to python).

To install:

pip install js2py

Solution 2 - Javascript

Using PyV8, I can do this. However, I have to replace document.write with return because there's no DOM and therefore no document.

import PyV8
ctx = PyV8.JSContext()
ctx.enter()

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
"""

print ctx.eval(js.replace("document.write", "return "))

Or you could create a mock document object

class MockDocument(object):

    def __init__(self):
        self.value = ''

    def write(self, *args):
        self.value += ''.join(str(i) for i in args)
    

class Global(PyV8.JSClass):
    def __init__(self):
        self.document = MockDocument()

scope = Global()
ctx = PyV8.JSContext(scope)
ctx.enter()
ctx.eval(js)
print scope.document.value

Solution 3 - Javascript

One more solution as PyV8 seems to be unmaintained and dependent on the old version of libv8.

PyMiniRacer It's a wrapper around the v8 engine and it works with the new version and is actively maintained.

pip install py-mini-racer

from py_mini_racer import py_mini_racer
ctx = py_mini_racer.MiniRacer()
ctx.eval("""
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
""")
ctx.call("escramble_758")

And yes, you have to replace document.write with return as others suggested

Solution 4 - Javascript

You can use js2py context to execute your js code and get output from document.write with mock document object:

import js2py

js = """
var output;
document = {
	write: function(value){
		output = value;
	}
}
""" + your_script

context = js2py.EvalJs()
context.execute(js)
print(context.output)

Solution 5 - Javascript

You can use requests-html which will download and use chromium underneath.

from requests_html import HTML

html = HTML(html="<a href='http://www.example.com/'>")

script = """
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
"""

val = html.render(script=script, reload=False)
print(val)
# +1 425-984-7450

More on this read here

Solution 6 - Javascript

quickjs should be the best option after quickjs come out. Just pip install quickjs and you are ready to go.

modify based on the example on README.

from quickjs import Function

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
escramble_758()
}
"""

escramble_758 = Function('escramble_758', js.replace("document.write", "return "))

print(escramble_758())

https://github.com/PetterS/quickjs

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjerrymouseView Question on Stackoverflow
Solution 1 - JavascriptPiotr DabkowskiView Answer on Stackoverflow
Solution 2 - JavascriptKien TruongView Answer on Stackoverflow
Solution 3 - JavascriptDienowView Answer on Stackoverflow
Solution 4 - JavascriptMirkoView Answer on Stackoverflow
Solution 5 - JavascriptLevonView Answer on Stackoverflow
Solution 6 - JavascriptechoView Answer on Stackoverflow