Determine file type in Ruby

Ruby Problem Overview


How does one reliably determine a file's type? File extension analysis is not acceptable. Is there a Ruby-esque tool similar to the UNIX file(1) command?

This is regarding MIME or content type, not file system classifications, such as directory, file, or socket.

Ruby Solutions


Solution 1 - Ruby

There is a ruby binding to libmagic that does what you need. It is available as a gem named ruby-filemagic:

gem install ruby-filemagic

This requires the libmagic development files (the libmagic-dev package on Debian and Ubuntu).

The documentation seems a little thin, but this should get you started:

$ irb 
irb(main):001:0> require 'filemagic' 
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip') 
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0> 

Solution 2 - Ruby

If you're on a Unix machine try this:

require 'shellwords'
mimetype = `file -Ib #{Shellwords.escape(path)}`.chomp

I'm not aware of any pure Ruby solutions that work as reliably as 'file'.

Edited to add: depending on which OS you are running, you may need to use '-i' instead of '-I' to get file to return a MIME type.

Solution 3 - Ruby

I found shelling out to be the most reliable. For compatibility on both Mac OS X and Ubuntu Linux I used:

file --mime -b myvideo.mp4
video/mp4; charset=binary

Without --mime, Ubuntu also prints media format details when it can, which is pretty cool:

file -b myvideo.mp4
ISO Media, MPEG v4 system, version 2
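
From Ruby, the same command can be run without going through a shell, and its output split into the MIME type and the charset. The following is a minimal sketch, assuming file(1) is on the PATH and prints output in the "type; charset=..." shape shown above. The method names parse_mime_output and mime_info are hypothetical.

```ruby
require 'open3'

# Splits a line such as "video/mp4; charset=binary" into its two parts.
# Returns [mime_type, charset]; charset is nil when none is reported.
def parse_mime_output(line)
  mime_type, params = line.strip.split(';', 2)
  charset = params.to_s[/charset=([^\s;]+)/, 1]
  [mime_type, charset]
end

# Runs `file --mime -b` on the given path (assumes file(1) is installed;
# Open3 raises Errno::ENOENT if it is not).
# Returns nil when the command exits with a failure status.
def mime_info(path)
  output, status = Open3.capture2('file', '--mime', '-b', path)
  status.success? ? parse_mime_output(output) : nil
end
```

For the example above, mime_info('myvideo.mp4') would be expected to return the type and charset as two separate strings. Passing the arguments as a list means the path is never interpreted by a shell.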

Solution 4 - Ruby

You can use this method, which is based on the magic header of the file and falls back to file(1) for anything it does not recognize:

require 'open3'

def get_image_extension(local_file_path)
  # Read the first 10 bytes in binary mode; to_s covers an empty file.
  header = File.binread(local_file_path, 10).to_s
  if header.start_with?('GIF8')
    'gif'
  elsif header.start_with?("\x89PNG".b)
    'png'
  elsif header.start_with?("\xFF\xD8\xFF\xE0\x00\x10JFIF".b)
    'jpg'
  elsif header.start_with?("\xFF\xD8\xFF\xE1".b) && header[6, 4] == 'Exif'
    'jpg'
  else
    # Fall back to file(1); works on Linux and Mac. The argument list avoids the shell.
    mime_type, status = Open3.capture2('file', '--brief', '--mime-type', local_file_path)
    # UnprocessableEntity is assumed to be defined by the surrounding application.
    raise UnprocessableEntity, 'unknown file type' unless status.success? && mime_type.include?('/')
    mime_type.strip.split('/')[1].gsub('x-', '').gsub('jpeg', 'jpg').gsub('text', 'txt')
  end
end
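
The header check above can also be sketched in a table-driven form, which makes it easier to add formats later. This is only an illustration: the constant MAGIC_SIGNATURES and the method names are hypothetical, the table is not exhaustive, and the JPEG entry is a deliberate simplification that only tests the three leading bytes rather than the JFIF/Exif markers.

```ruby
# Hypothetical signature table: leading bytes => extension.
# Only the formats mentioned above are listed.
MAGIC_SIGNATURES = {
  'GIF8'.b         => 'gif',
  "\x89PNG".b      => 'png',
  "\xFF\xD8\xFF".b => 'jpg' # simplified: leading JPEG bytes only
}.freeze

# Returns the extension for the first matching signature, or nil.
def extension_from_bytes(bytes)
  header = bytes.to_s.b
  match = MAGIC_SIGNATURES.find { |signature, _ext| header.start_with?(signature) }
  match && match.last
end

# Reads only the first few bytes of the file, in binary mode.
def extension_from_file(path)
  extension_from_bytes(File.binread(path, 8))
end
```

A nil result means the header is not in the table, at which point a caller could fall back to file(1) as in the method above.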

Solution 5 - Ruby

This was added as a comment on this answer but should really be its own answer:

path = '/path/to/your/file' # replace with the path to your file

IO.popen(
  ["file", "--brief", "--mime-type", path],
  in: :close, err: :close
) { |io| io.read.chomp }

I can confirm that it worked for me.
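
Wrapped in a method, the same call can also report failure instead of raising when file(1) is not installed. A minimal sketch, assuming that returning nil is acceptable for both a missing command and a non-zero exit status; the method name mime_type_of and its command keyword are hypothetical.

```ruby
# Returns the MIME type reported by file(1), or nil if the command is
# missing or exits with an error. `command` is a hypothetical parameter
# added here so a different binary name can be supplied.
def mime_type_of(path, command: 'file')
  output = IO.popen(
    [command, '--brief', '--mime-type', path],
    in: :close, err: :close
  ) { |io| io.read.chomp }
  $?.success? && !output.empty? ? output : nil
rescue Errno::ENOENT
  nil # the command itself could not be found
end
```

Because the command is passed as an array, no shell is involved and the path needs no escaping.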

Solution 6 - Ruby

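If you're using the File class, you can augment it with the following methods, based on Patrick Ritchie's answer (Solution 2):

require 'shellwords'

class File
  # MIME type as reported by file(1), e.g. "video/mp4".
  def mime_type
    `file --brief --mime-type #{Shellwords.escape(path)}`.strip
  end

  # Charset as reported by file(1), e.g. "binary"; nil if none is reported.
  def charset
    `file --brief --mime #{Shellwords.escape(path)}`[/charset=([^\s;]+)/, 1]
  end
end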

And, if you're using Ruby on Rails, you can drop this into config/initializers/file.rb and have it available throughout your project.

Solution 7 - Ruby

For those who came here via a search engine, a modern approach to finding the MIME type in pure Ruby is to use the mimemagic gem.

require 'mimemagic'

MimeMagic.by_magic(File.open('tux.jpg')).type # => "image/jpeg" 

If you feel it is safe to rely on the file extension alone, you can use the mime-types gem:

require 'mime/types'
MIME::Types.type_for('tux.jpg') # => [#<MIME::Type: image/jpeg>]

Solution 8 - Ruby

You could give shared-mime a try (gem install shared-mime-info). It requires the Freedesktop shared-mime-info library, but it does both filename/extension checks and "magic" checks. I tried giving it a whirl myself just now, but I don't have the Freedesktop shared-mime-info database installed and have to do "real work," unfortunately. Still, it might be what you're looking for.

Solution 9 - Ruby

I recently found mimetype-fu.

It seems to be the easiest reliable solution to get a file's MIME type.

The only caveat is that on a Windows machine it only uses the file extension, whereas on *Nix based systems it works great.

Solution 10 - Ruby

Pure Ruby solution using magic bytes and returning a symbol for the matching type:

https://github.com/SixArm/sixarm_ruby_magic_number_type

I wrote it, so if you have suggestions, let me know.

Solution 11 - Ruby

The best I found so far:

http://bogomips.org/mahoro.git/

Solution 12 - Ruby

The mime-types gem for Ruby works well.

Solution 13 - Ruby

You could give MIME::Types for Ruby a go.

>This library allows for the identification of a file’s likely MIME content type. The identification of MIME content type is based on a file’s filename extensions.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Clint Pachl | View Question on Stackoverflow
Solution 1 - Ruby | Martin Carpenter | View Answer on Stackoverflow
Solution 2 - Ruby | Patrick Ritchie | View Answer on Stackoverflow
Solution 3 - Ruby | jamiew | View Answer on Stackoverflow
Solution 4 - Ruby | Alain Beauvois | View Answer on Stackoverflow
Solution 5 - Ruby | Jason Swett | View Answer on Stackoverflow
Solution 6 - Ruby | spyle | View Answer on Stackoverflow
Solution 7 - Ruby | Paulo Fidalgo | View Answer on Stackoverflow
Solution 8 - Ruby | Chris Ingrassia | View Answer on Stackoverflow
Solution 9 - Ruby | heathanderson | View Answer on Stackoverflow
Solution 10 - Ruby | joelparkerhenderson | View Answer on Stackoverflow
Solution 11 - Ruby | knoopx | View Answer on Stackoverflow
Solution 12 - Ruby | Qianjigui | View Answer on Stackoverflow
Solution 13 - Ruby | Bobby Jack | View Answer on Stackoverflow