How can I download all emails with attachments from Gmail?

Gmail Problem Overview


How do I connect to Gmail and determine which messages have attachments? I then want to download each attachment, printing out the Subject: and From: for each message as I process it.

Gmail Solutions


Solution 1 - Gmail

Hard one :-)

import email, getpass, imaplib, os

detach_dir = '.' # directory where to save attachments (default: current)
user = raw_input("Enter your GMail username:")
pwd = getpass.getpass("Enter your password: ")

# connecting to the gmail imap server
m = imaplib.IMAP4_SSL("imap.gmail.com")
m.login(user,pwd)
m.select("[Gmail]/All Mail") # here you can choose a mailbox like INBOX instead
# use m.list() to get all the mailboxes
 
resp, items = m.search(None, "ALL") # you could filter using the IMAP rules here (check http://www.example-code.com/csharp/imap-search-critera.asp)
items = items[0].split() # getting the mail ids

for emailid in items:
    resp, data = m.fetch(emailid, "(RFC822)") # fetching the mail; "(RFC822)" means "get the whole message", but you can ask for headers only, etc.
    email_body = data[0][1] # getting the mail content
    mail = email.message_from_string(email_body) # parsing the mail content to get a mail object

    #Check if any attachments at all
    if mail.get_content_maintype() != 'multipart':
        continue

    print "[%s] : %s" % (mail["From"], mail["Subject"]) # formatting copes with a missing header (None)

    counter = 1 # numbers the unnamed parts of this message

    # we use walk to create a generator so we can iterate on the parts and forget about the recursive headache
    for part in mail.walk():
        # multipart are just containers, so we skip them
        if part.get_content_maintype() == 'multipart':
            continue

        # is this part an attachment ?
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()

        # if there is no filename, we create one from the message id and a counter to avoid duplicates
        if not filename:
            filename = 'msg-%s-part-%03d.bin' % (emailid, counter)
            counter += 1

        att_path = os.path.join(detach_dir, filename)

        # skip the file if it's already there
        if not os.path.isfile(att_path) :
            # finally write the stuff
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()

Wowww! That was something. ;-) But try the same in Java, just for fun!

By the way, I tested that in a shell, so some errors likely remain.
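
One error that likely remains: `part.get_filename()` returns whatever the sender put in the header, so it can carry directory components or RFC 2047 encoded words. Here is a minimal Python 3 sketch of cleaning the name before passing it to `os.path.join`. The helper `safe_filename` and its fallback name are hypothetical and not part of the script above.

```python
import re
from email.header import decode_header, make_header


def safe_filename(raw_name, fallback="attachment.bin"):
    """Return a file name that is safe to join onto a local directory."""
    if not raw_name:
        return fallback
    # Decode RFC 2047 encoded words such as "=?utf-8?b?...?="
    name = str(make_header(decode_header(raw_name)))
    # Drop directory components, whichever separator the sender used
    name = name.replace("\\", "/").split("/")[-1]
    # Replace characters that are awkward on common file systems
    name = re.sub(r"[^\w.\- ]", "_", name).strip(" .")
    return name or fallback
```

In the loop above it would be used as `filename = safe_filename(part.get_filename())`.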

Enjoy

EDIT:

Because mail-box names can change from one country to another, I recommend doing m.list() and picking an item in it before m.select("the mailbox name") to avoid this error:

> imaplib.error: command SEARCH illegal in state AUTH, only allowed in states SELECTED
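
Picking the mailbox from `m.list()` can be automated. This Python 3 sketch assumes the server tags the "All Mail" folder with the special-use flag `\All` in its LIST response, which Gmail appears to do regardless of display language. `find_mailbox_by_flag` is a hypothetical helper name.

```python
import re

# Matches one LIST response line, e.g.
#   (\HasNoChildren \All) "/" "[Gmail]/All Mail"
LIST_LINE = re.compile(
    rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)')


def find_mailbox_by_flag(list_lines, flag=b"\\All"):
    """Return the (still IMAP-quoted) name of the first mailbox with `flag`."""
    for line in list_lines:
        if not isinstance(line, bytes):
            continue  # skip literal-style responses in this simple sketch
        match = LIST_LINE.match(line)
        if match and flag.lower() in match.group("flags").lower().split():
            return match.group("name").decode("ascii")
    return None

# Usage, assuming `m` is a logged-in imaplib.IMAP4_SSL:
#   typ, lines = m.list()
#   m.select(find_mailbox_by_flag(lines) or "INBOX")
```

The name is returned with its surrounding quotes so that it can be handed to `select()` unchanged even when it contains spaces.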

Solution 2 - Gmail

I'm not an expert on Perl, but what I do know is that GMail supports IMAP and POP3, 2 protocols that are completely standard and allow you to do just that.

Maybe that helps you to get started.

Solution 3 - Gmail

#!/usr/bin/env python
"""Save all attachments for given gmail account."""
import os, sys
from libgmail import GmailAccount

ga = GmailAccount("username@gmail.com", "pA$$w0Rd_")
ga.login()

# folders: inbox, starred, all, drafts, sent, spam
for thread in ga.getMessagesByFolder('all', allPages=True):
    for msg in thread:
        sys.stdout.write('.')
        if msg.attachments:
            print "\n", msg.id, msg.number, msg.subject, msg.sender
            for att in msg.attachments:
                if att.filename and att.content:
                    attdir = os.path.join(thread.id, msg.id)
                    if not os.path.isdir(attdir):
                        os.makedirs(attdir)
                    with open(os.path.join(attdir, att.filename), 'wb') as f:
                        f.write(att.content)

untested

  1. Make sure the TOS allows such scripts, otherwise your account will be suspended
  2. There might be better options: GMail offline mode, Thunderbird + ExtractExtensions, GmailFS, Gmail Drive, etc.

Solution 4 - Gmail

Take a look at Mail::Webmail::Gmail:

GETTING ATTACHMENTS

There are two ways to get an attachment:

1 -> By sending a reference to a specific attachment returned by get_indv_email

# Creates an array of references to every attachment in your account
my $messages = $gmail->get_messages();
my @attachments;

foreach ( @{ $messages } ) {
    my $email = $gmail->get_indv_email( msg => $_ );
    if ( defined( $email->{ $_->{ 'id' } }->{ 'attachments' } ) ) {
        foreach ( @{ $email->{ $_->{ 'id' } }->{ 'attachments' } } ) {
            push( @attachments, $gmail->get_attachment( attachment => $_ ) );
            if ( $gmail->error() ) {
                print $gmail->error_msg();
            }
        }
    }
}

2 -> Or by sending the attachment ID and message ID

#retrieve specific attachment
my $msgid = 'F000000000';
my $attachid = '0.1';
my $attach_ref = $gmail->get_attachment( attid => $attachid, msgid => $msgid );

( Returns a reference to a scalar that holds the data from the attachment. )

Solution 5 - Gmail

Within Gmail, you can filter on "has:attachment"; use it to identify the messages you should be getting when testing. Note that this appears to match both messages with attached files (paperclip icon shown) and messages with inline images (no paperclip shown).
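
The same filter can be sent over IMAP through Gmail's `X-GM-RAW` search extension. This is a Gmail-specific extension, so other IMAP servers will reject it. The sketch below only builds the search arguments. `gmail_raw_search_args` is a hypothetical helper name.

```python
def gmail_raw_search_args(query):
    """Build the arguments for a Gmail X-GM-RAW search with imaplib."""
    # The query travels as an IMAP quoted string, so escape \ and "
    escaped = query.replace("\\", "\\\\").replace('"', '\\"')
    return ("X-GM-RAW", '"%s"' % escaped)

# Usage, assuming `m` is a logged-in imaplib.IMAP4_SSL with a mailbox selected:
#   typ, data = m.search(None, *gmail_raw_search_args("has:attachment"))
#   message_ids = data[0].split()
```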

There is no Gmail API, so IMAP or POP are your only real options. The JavaMail API may be of some assistance as well as this very terse article on downloading attachments from IMAP using Perl. Some previous questions here on SO may also help.

This PHP example may help too. Unfortunately from what I can see, there is no attachment information contained within the imap_header, so downloading the body is required to be able to see the X-Attachment-Id field. (someone please prove me wrong).
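
One possible way to prove that wrong is to fetch only `BODYSTRUCTURE`, which describes the MIME parts without transferring their content. The following Python sketch is a rough heuristic rather than a full BODYSTRUCTURE parser: it just looks for an `attachment` disposition in the raw response. `looks_like_it_has_attachment` is a hypothetical name.

```python
import re

# A disposition shows up in BODYSTRUCTURE as ("attachment" ("filename" "...")),
# in either upper or lower case depending on the server.
_ATTACHMENT = re.compile(rb'\("attachment"', re.IGNORECASE)


def looks_like_it_has_attachment(bodystructure):
    """Heuristic: True if the response mentions an attachment disposition."""
    if isinstance(bodystructure, tuple):
        # imaplib returns a tuple when the server used literals
        bodystructure = b" ".join(bodystructure)
    return bool(_ATTACHMENT.search(bodystructure))

# Usage, assuming `m` is a logged-in imaplib.IMAP4_SSL with a mailbox selected:
#   typ, data = m.fetch(msg_id, "(BODYSTRUCTURE)")
#   if looks_like_it_has_attachment(data[0]):
#       typ, data = m.fetch(msg_id, "(RFC822)")
```

Inline images are usually marked `inline` rather than `attachment`, so this check skips them.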

Solution 6 - Gmail

The question is quite old, and at that time the Gmail API was not available. Google now provides the Gmail API as an alternative to IMAP access. See Google's Gmail API here. Also see google-api-python-client on PyPI.
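
A rough sketch of how attachments might be pulled through the Gmail API. It assumes `service` is an already-authorized client built with google-api-python-client (for example via `googleapiclient.discovery.build("gmail", "v1", credentials=...)`). The helper names are hypothetical, and pagination of the message list is left out for brevity.

```python
import base64


def iter_attachment_parts(part):
    """Yield (filename, attachment_id) for a payload and all nested parts."""
    body = part.get("body", {})
    if part.get("filename") and body.get("attachmentId"):
        yield part["filename"], body["attachmentId"]
    for child in part.get("parts", []):
        yield from iter_attachment_parts(child)


def decode_attachment_data(data):
    """Decode the base64url string found in an attachment resource."""
    padded = data + "=" * (-len(data) % 4)  # tolerate missing padding
    return base64.urlsafe_b64decode(padded)


def download_attachments(service, query="has:attachment", user_id="me"):
    """Yield (filename, content_bytes) for the first page of matches."""
    messages = service.users().messages()
    listing = messages.list(userId=user_id, q=query).execute()
    for ref in listing.get("messages", []):
        message = messages.get(userId=user_id, id=ref["id"]).execute()
        for filename, att_id in iter_attachment_parts(message["payload"]):
            att = messages.attachments().get(
                userId=user_id, messageId=ref["id"], id=att_id).execute()
            yield filename, decode_attachment_data(att["data"])
```

The `From` and `Subject` values are available in `message["payload"]["headers"]` of each fetched message.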

Solution 7 - Gmail

If any of you have updated to Python 3.3: I took the 2.7 script from HERE and updated it to 3.3. I also fixed some issues with the way Gmail was returning the information.

# Something in lines of http://stackoverflow.com/questions/348630/how-can-i-download-all-emails-with-attachments-from-gmail
# Make sure you have IMAP enabled in your gmail settings.
# Right now it won't download same file name twice even if their contents are different.
# Gmail as of now returns in bytes but just in case they go back to string this line is left here.

import email
import getpass, imaplib
import os
import sys
import time
 
detach_dir = '.'
# create the target directory inside detach_dir if it does not exist yet
os.makedirs(os.path.join(detach_dir, 'attachments'), exist_ok=True)
 
userName = input('Enter your GMail username:\n')
passwd = getpass.getpass('Enter your password:\n')


try:
	imapSession = imaplib.IMAP4_SSL('imap.gmail.com',993)
	typ, accountDetails = imapSession.login(userName, passwd)
	if typ != 'OK':
		print ('Not able to sign in!')
		raise RuntimeError('login failed') # a bare raise has no active exception to re-raise
 
	imapSession.select('Inbox')
	typ, data = imapSession.search(None, 'ALL')
	if typ != 'OK':
		print ('Error searching Inbox.')
		raise RuntimeError('search failed') # a bare raise has no active exception to re-raise

	# Iterating over all emails
	for msgId in data[0].split():
		typ, messageParts = imapSession.fetch(msgId, '(RFC822)')

		if typ != 'OK':
			print ('Error fetching mail.')
			raise RuntimeError('fetch failed') # a bare raise has no active exception to re-raise
		
		#print(type(emailBody))
		emailBody = messageParts[0][1]
		#mail = email.message_from_string(emailBody)
		mail = email.message_from_bytes(emailBody)

		for part in mail.walk():
			#print (part)
			if part.get_content_maintype() == 'multipart':
				# print part.as_string()
				continue
			if part.get('Content-Disposition') is None:
				# print part.as_string()
				continue
			
			fileName = part.get_filename()
					
			if fileName:
				filePath = os.path.join(detach_dir, 'attachments', fileName)
				if not os.path.isfile(filePath):
					print (fileName)
					with open(filePath, 'wb') as fp:
						fp.write(part.get_payload(decode=True))
					
	imapSession.close()
	imapSession.logout()
	
except Exception as e:
	print ('Not able to download all attachments:', e)
	time.sleep(3)
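
As the comment at the top of the script says, a second attachment with the same file name is silently skipped even if its contents differ. One way around that is to pick a free name instead of skipping. This is a minimal sketch, with `unique_path` as a hypothetical helper. The `exists` parameter is only there so the function can be tried without touching the disk.

```python
import os


def unique_path(directory, filename, exists=os.path.exists):
    """Return a path in `directory` for which `exists` is False.

    "report.pdf" becomes "report (1).pdf", "report (2).pdf", ... as needed.
    """
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(directory, filename)
    counter = 1
    while exists(candidate):
        candidate = os.path.join(directory, "%s (%d)%s" % (stem, counter, ext))
        counter += 1
    return candidate
```

In the script above, `filePath = unique_path(os.path.join(detach_dir, 'attachments'), fileName)` would replace the `os.path.isfile` check. Note that running the script twice would then save every attachment twice.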

Solution 8 - Gmail

Since Gmail supports the standard protocols POP and IMAP, any platform, tool, application, component, or API that provides the client side of either protocol should work.

I suggest doing a Google search for your favorite language/platform (e.g., "python"), plus "pop", plus "imap", plus perhaps "open source", plus perhaps "download" or "review", and see what you get for options.

There are numerous free applications and components, pick a few that seem worthy, check for reviews, then download and enjoy.

Solution 9 - Gmail

You should be aware of the fact that you need SSL to connect to GMail (both for POP3 and IMAP - this is of course true also for their SMTP-servers apart from port 25 but that's another story).
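
In Python, for example, that means using the `_SSL` variants of the standard-library clients. This is a sketch only: the function names are hypothetical, and depending on the account settings Gmail may require an app password or OAuth rather than the normal password.

```python
import imaplib
import poplib
import ssl

GMAIL_IMAP = ("imap.gmail.com", imaplib.IMAP4_SSL_PORT)  # port 993
GMAIL_POP = ("pop.gmail.com", poplib.POP3_SSL_PORT)      # port 995


def open_imap(user, password):
    """Open an IMAP connection with certificate verification enabled."""
    context = ssl.create_default_context()
    conn = imaplib.IMAP4_SSL(GMAIL_IMAP[0], GMAIL_IMAP[1], ssl_context=context)
    conn.login(user, password)
    return conn


def open_pop(user, password):
    """Open a POP3 connection with certificate verification enabled."""
    context = ssl.create_default_context()
    conn = poplib.POP3_SSL(GMAIL_POP[0], GMAIL_POP[1], context=context)
    conn.user(user)
    conn.pass_(password)
    return conn
```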

Solution 10 - Gmail

Here's something I wrote to download my bank statements in Groovy (dynamic language for the Java Platform).

import javax.mail.*
import java.util.Properties

String  gmailServer
int gmailPort
def user, password, LIMIT
def inboxFolder, root, StartDate, EndDate


//    Downloads all attachments from a gmail mail box as per some criteria
//    to a specific folder
//    Based on code from
//    http://agileice.blogspot.com/2008/10/using-groovy-to-connect-to-gmail.html
//    http://stackoverflow.com/questions/155504/download-mail-attachment-with-java
//
//    Requires: 
//        java mail jars in the class path (mail.jar and activation.jar)
//        openssl, with gmail certificate added to java keystore (see agileice blog)
//        
//    further improvement: maybe findAll could be used to filter messages
//    subject could be added as another criteria
////////////////////// <CONFIGURATION> //////////////////////
// Maximum number of emails to access in case parameter range is too high
LIMIT = 10000

// gmail credentials
gmailServer = "imap.gmail.com"
gmailPort = 993

user = "username@gmail.com"
password = "gmailpassword"

// gmail label, or "INBOX" for inbox
inboxFolder = "finance"

// local file system where the attachment files need to be stored
root = "D:\\AttachmentStore" 

// date range dd-mm-yyyy
StartDate= "31-12-2009"
EndDate = "1-6-2010" 
////////////////////// </CONFIGURATION> //////////////////////

StartDate = Date.parse("dd-MM-yyyy", StartDate)
EndDate = Date.parse("dd-MM-yyyy", EndDate)

Properties props = new Properties();
props.setProperty("mail.store.protocol", "imaps");
props.setProperty("mail.imaps.host", gmailServer);
props.setProperty("mail.imaps.port", gmailPort.toString());
props.setProperty("mail.imaps.partialfetch", "false");

def session = javax.mail.Session.getDefaultInstance(props,null)
def store = session.getStore("imaps")

store.connect(gmailServer, user, password)

int i = 0;
def folder = store.getFolder(inboxFolder)

folder.open(Folder.READ_ONLY)

for(def msg : folder.messages) {

     //if (msg.subject?.contains("bank Statement"))
     println "[$i] From: ${msg.from} Subject: ${msg.subject} -- Received: ${msg.receivedDate}"
     
     if (msg.receivedDate <  StartDate || msg.receivedDate > EndDate) {
         println "Ignoring due to date range"
         continue
     }
     
     
     if (msg.content instanceof Multipart) {
         Multipart mp = (Multipart)msg.content;
         
         for (int j=0; j < mp.count; j++) {
         
             Part part = mp.getBodyPart(j);
             
             println " ---- ${part.fileName} ---- ${part.disposition}"
             
             if (part.disposition?.equalsIgnoreCase(Part.ATTACHMENT)) {

                 if (part.content) {
                     
                     def name = msg.receivedDate.format("yyyy_MM_dd") + " " + part.fileName
                     println "Saving file to $name"
                     
                     def f = new File(root, name)

                     //f << part.content
                     try {
                         if (!f.exists())
                             f << part.content
                     }
                     catch (Exception e) {
                         println "*** Error *** $e" 
                     }
                 }
                 else {
                    println "NO Content Found!!"
                 }
             }
         }
     }
     
     if (i++ > LIMIT)
         break;

}
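
The script above opens every message in the folder and applies the date range on the client. With IMAP the range can also be pushed to the server through the SEARCH keys `SINCE` and `BEFORE`. Below is a small Python sketch of building those criteria. The helper names are hypothetical, and it treats both ends of the range as inclusive, which is an assumption about the intent.

```python
from datetime import date, timedelta

# Spelled out so the result does not depend on the process locale
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d):
    """Format a date the way IMAP SEARCH expects, e.g. 31-Dec-2009."""
    return "%d-%s-%d" % (d.day, _MONTHS[d.month - 1], d.year)


def date_range_criteria(start, end):
    """SEARCH criteria for messages received from `start` to `end` inclusive."""
    # SINCE includes its date, BEFORE excludes it, so move the end out one day
    return "(SINCE %s BEFORE %s)" % (
        imap_date(start), imap_date(end + timedelta(days=1)))

# Usage, assuming `m` is a logged-in imaplib.IMAP4_SSL with a mailbox selected:
#   typ, data = m.search(None, date_range_criteria(date(2009, 12, 31),
#                                                  date(2010, 6, 1)))
```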

Solution 11 - Gmail

/*based on http://www.codejava.net/java-ee/javamail/using-javamail-for-searching-e-mail-messages*/
package getMailsWithAtt;

import java.io.File;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;

import javax.mail.Address;
import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.NoSuchProviderException;
import javax.mail.Part;
import javax.mail.Session;
import javax.mail.Store;
import javax.mail.internet.MimeBodyPart;
import javax.mail.search.AndTerm;
import javax.mail.search.SearchTerm;
import javax.mail.search.ReceivedDateTerm;
import javax.mail.search.ComparisonTerm;

public class EmailReader {
	private String saveDirectory;

	/**
	 * Sets the directory where attached files will be stored.
	 * 
	 * @param dir
	 *            absolute path of the directory
	 */
	public void setSaveDirectory(String dir) {
		this.saveDirectory = dir;
	}

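	/**
	 * Searches the inbox for messages received within a date range and saves
	 * their attachments to disk, if any.
	 * 
	 * @param host      mail server host (currently unused, the IMAP host is hard-coded below)
	 * @param port      mail server port (currently unused)
	 * @param userName  account user name
	 * @param password  account password
	 * @param startDate upper bound: only messages received before this date match
	 * @param endDate   lower bound: only messages received after this date match
	 */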
	public void downloadEmailAttachments(String host, String port,
			String userName, String password, Date startDate, Date endDate) {
		Properties props = System.getProperties();
		props.setProperty("mail.store.protocol", "imaps");
		try {
			Session session = Session.getDefaultInstance(props, null);
			Store store = session.getStore("imaps");
			store.connect("imap.gmail.com", userName, password);
			// ...
			Folder inbox = store.getFolder("INBOX");
			inbox.open(Folder.READ_ONLY);
			SearchTerm olderThan = new ReceivedDateTerm (ComparisonTerm.LT, startDate);
			SearchTerm newerThan = new ReceivedDateTerm (ComparisonTerm.GT, endDate);
			SearchTerm andTerm = new AndTerm(olderThan, newerThan);
			//Message[] arrayMessages = inbox.getMessages(); <--get all messages
			Message[] arrayMessages = inbox.search(andTerm);
			for (int i = arrayMessages.length; i > 0; i--) { //from newer to older
				Message msg = arrayMessages[i-1];
				Address[] fromAddress = msg.getFrom();
				String from = fromAddress[0].toString();
				String subject = msg.getSubject();
				String sentDate = msg.getSentDate().toString();
				String receivedDate = msg.getReceivedDate().toString();

				String contentType = msg.getContentType();
				String messageContent = "";

				// store attachment file name, separated by comma
				String attachFiles = "";

				if (contentType.contains("multipart")) {
					// content may contain attachments
					Multipart multiPart = (Multipart) msg.getContent();
					int numberOfParts = multiPart.getCount();
					for (int partCount = 0; partCount < numberOfParts; partCount++) {
						MimeBodyPart part = (MimeBodyPart) multiPart
								.getBodyPart(partCount);
						if (Part.ATTACHMENT.equalsIgnoreCase(part
								.getDisposition())) {
							// this part is attachment
							String fileName = part.getFileName();
							attachFiles += fileName + ", ";
							part.saveFile(saveDirectory + File.separator + fileName);
						} else {
							// this part may be the message content
							messageContent = part.getContent().toString();
						}
					}
					if (attachFiles.length() > 1) {
						attachFiles = attachFiles.substring(0,
								attachFiles.length() - 2);
					}
				} else if (contentType.contains("text/plain")
						|| contentType.contains("text/html")) {
					Object content = msg.getContent();
					if (content != null) {
						messageContent = content.toString();
					}
				}

				// print out details of each message
				System.out.println("Message #" + i + ":");
				System.out.println("\t From: " + from);
				System.out.println("\t Subject: " + subject);
				System.out.println("\t Sent: " + sentDate);
				System.out.println("\t Received: " + receivedDate);
				System.out.println("\t Message: " + messageContent);
				System.out.println("\t Attachments: " + attachFiles);
			}

			// disconnect
			inbox.close(false);
			store.close();

		} catch (NoSuchProviderException e) {
			e.printStackTrace();
			System.exit(1);
		} catch (MessagingException e) {
			e.printStackTrace();
			System.exit(2);
		} catch (IOException ex) {
			ex.printStackTrace();
		}
	}

	/**
	 * Runs this program against the Gmail IMAP server
	 * @throws ParseException 
	 */
	public static void main(String[] args) throws ParseException {
		String host = "imap.gmail.com";
		String port = "993";
		String userName = "username@gmail.com";
		String password = "pass";
		Date startDate = new SimpleDateFormat("yyyy-MM-dd").parse("2014-06-30");
		Date endDate = new SimpleDateFormat("yyyy-MM-dd").parse("2014-06-01");
		String saveDirectory = "C:\\Temp";

		EmailReader receiver = new EmailReader();
		receiver.setSaveDirectory(saveDirectory);
		receiver.downloadEmailAttachments(host, port, userName, password,startDate,endDate);

	}
}

Maven Dependency:

<dependency>
	<groupId>com.sun.mail</groupId>
	<artifactId>javax.mail</artifactId>
	<version>1.5.1</version>
</dependency>
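
One thing to watch for: the loop above inspects only the top-level parts of each message, so an attachment that sits inside a nested multipart would be missed. For comparison, here is a Python sketch that walks the whole MIME tree. `attachment_names` is a hypothetical helper, not a port of the Java class.

```python
from email import message_from_bytes, policy


def attachment_names(raw_bytes):
    """Return the file names of all attachment parts, however deeply nested."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    return [part.get_filename()
            for part in msg.walk()  # depth-first over every MIME part
            if part.get_content_disposition() == "attachment"]
```

The equivalent in Java would be to recurse whenever a body part's content is itself a `Multipart`.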

Solution 12 - Gmail

Have you taken a look at the Gmail third-party add-ons listed on Wikipedia?

In particular, PhpGmailDrive is an open source add-on that you may be able to use as-is, or perhaps study for inspiration?

Solution 13 - Gmail

For Java, you will find G4J of use. It's a set of APIs to communicate with Google Mail via Java (the screenshot on the homepage is a demonstration email client built around this).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | anon | View Question on Stackoverflow
Solution 1 - Gmail | e-satis | View Answer on Stackoverflow
Solution 2 - Gmail | Jeroen Landheer | View Answer on Stackoverflow
Solution 3 - Gmail | jfs | View Answer on Stackoverflow
Solution 4 - Gmail | JDrago | View Answer on Stackoverflow
Solution 5 - Gmail | Kevin Haines | View Answer on Stackoverflow
Solution 6 - Gmail | Mitesh Budhabhatti | View Answer on Stackoverflow
Solution 7 - Gmail | Eric Thomas | View Answer on Stackoverflow
Solution 8 - Gmail | Rob Williams | View Answer on Stackoverflow
Solution 9 - Gmail | moster67 | View Answer on Stackoverflow
Solution 10 - Gmail | msanjay | View Answer on Stackoverflow
Solution 11 - Gmail | jechaviz | View Answer on Stackoverflow
Solution 12 - Gmail | toolkit | View Answer on Stackoverflow
Solution 13 - Gmail | Brian Agnew | View Answer on Stackoverflow