Naming Torrents

retiolus@lemmy.cat · 1 year ago

Jo Miran@lemmy.ml · edit-2 1 year ago

Dealing with spaces while scripting or in terminal is such a pain in the ass. The true dark path of horror is using spaces indeed.

Alien Nathan Edward@lemm.ee · 1 year ago

I work on a Web app and we recently decided that we’re just not gonna support double quotes in free text fields because oh holy balls what a thing it is to try to deal with those in a way that doesn’t open you up to multiple encoding vulnerabilities.

FooBarrington@lemmy.world · 1 year ago

That’s… Surprising. If you’re doing things right, double quotes should be no trouble at all:

HTTP requests have simple, automatic encoding
SQL queries with prepared statements don’t need any special handling for double quotes
Rendering the data should happen with proper escaping etc.

They are usually only trouble if you’re doing SQL queries wrong (concatenation etc.) or if you’re not escaping your output.

Alien Nathan Edward@lemm.ee · edit-2 1 year ago

The issue is that the filter we’re using to avoid multiple encoding attacks de-escapes everything via multiple rounds, then tries to pass it to the next layer of filtering with the de-escaped request body as a JSON string. You’re absolutely right that this is a silly way of doing it, but sometimes we have to live with decisions that were made before we were onboarded to a project. In this particular case, I pushed to improve the filters but all our PO heard was “spend development time weakening security” and at the end of the day they decide what to do and we do it.

FooBarrington@lemmy.world · 1 year ago

Ah, that’s understandable. Sorry you have to go through that!

WarmApplePieShrek@lemmy.dbzer0.com · 1 year ago

The filter you’re using to avoid multiple encoding attacks creates multiple encoding attacks.

Alien Nathan Edward@lemm.ee · edit-2 1 year ago

You should tell that to OWASP then, they wrote it. org.owasp.esapi 2.5.2.0, class is Encoder, method is canonicalize(String, bool, bool)

WarmApplePieShrek@lemmy.dbzer0.com · 1 year ago

This method is a band-aid patch when your downstream code is all messed up and you can’t fix it. Instead of treating the input string correctly, it just removes anything that might possibly trigger some vulnerability in wrong code.

pete_the_cat@lemmy.world · edit-2 1 year ago

It’s a way bigger pain in the ass than people think it is. I remember having to parse output from a tool for work that had tons of output in tabular format, mixed with normal sentence-like strings. JSON, YAML, or XML outputs weren’t available so I had to do a nasty mess of grep, awk, cut, and head/tail to get what I wanted. My first attempt was literally counting the characters so I could cut out exactly what I needed, but as we all know, hardcoding values is a recipe for headaches later on.
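For illustration, here is a minimal sketch of field-based extraction instead of character counting. The report layout (NAME/STATE/SIZE) and both function names are made up for the example; the real tool's output is not shown in this thread.

```sh
# Hypothetical tool output: a sentence-like line mixed with a table.
report() {
  printf '%s\n' \
    'Scan finished with 2 warnings.' \
    'NAME     STATE    SIZE' \
    'alpha    ok       120' \
    'beta     failed   7' \
    'gamma    ok       4096'
}

# Print the NAME column of rows whose STATE is "failed".
# Rows are picked by field count and content, not by character offsets,
# so a column getting wider does not break the extraction.
failed_names() {
  awk 'NF == 3 && $2 == "failed" { print $1 }'
}

report | failed_names
```

This only works while no field contains whitespace itself, which is the same problem the rest of the thread is complaining about.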

Jo Miran@lemmy.ml · edit-2 1 year ago

Here’s a horror story from literally yesterday. We have been fighting a system for a client for weeks and it has been a nightmare. Our clients just told us that they outsourced some of their work to an Indian outfit but that outfit is unfamiliar with Linux and doesn’t know how to edit text files so they have been downloading the files to their Windows machines, editing them in Windows, then uploading the contaminated text files back into Linux. None of them, not our client nor the outfit they hired, understood why this was a problem. We have no idea what files are affected and we won’t know until they fail because they obviously did not keep track of what they touched.

EDIT: I’m being intentionally vague.

PorkSoda@lemmy.world · 1 year ago

Haha this is up there with having to explain why opening a csv in Excel and then saving means that I don’t want the file.

ramblinguy@sh.itjust.works · 1 year ago

I will never forgive Excel for automatically converting all of my dates to some weird ass format, or stripping single quotes randomly, or some other BS that they do for no reason

ddh@lemmy.sdf.org · 1 year ago

My absolute favourite is stripping leading zeroes from any text that looks like a number, then displaying it in scientific notation. But we get Copilot, so it balances out, right?

murtaza64@programming.dev · 1 year ago

If this is about line endings, surely a simple shell or python script could correct them?

m_randall@sh.itjust.works · 1 year ago

There’s already a command for it:

https://linux.die.net/man/1/dos2unix
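If dos2unix isn't installed, a rough equivalent can be sketched with awk. The function name crlf_to_lf and the ".tmp" scratch suffix are assumptions for the example, not part of any tool.

```sh
# Rewrite a file in place with every CRLF line ending turned into LF.
# The ".tmp" suffix for the scratch copy is an arbitrary choice.
crlf_to_lf() {
  awk '{ sub(/\r$/, ""); print }' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

sample=$(mktemp)
printf 'bla\r\nbla\r\n' > "$sample"
crlf_to_lf "$sample"
od -c "$sample"    # the \r bytes should be gone
```

Only a carriage return at the very end of a line is removed, so a stray \r in the middle of a line is left alone.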

Astaroth@lemm.ee · edit-2 1 year ago

Does Windows add an extra character at the end that gets converted to a new line on Linux? Because the other day I was copying a script and after pasting it an extra line was added after every single line, even the empty lines.

how it looked when I copied it:

bla
bla

bla

what it turned into:

bla

bla



bla

candybrie@lemmy.world · 1 year ago

Windows uses CR LF (carriage return, line feed), whereas Unix just uses LF. For added fun, macs use CR.

noughtnaut@lemmy.world · 1 year ago

For added fun, macs use CR.

This used to be true, for sure, but I thought this changed with OS X (which is essentially PrettyBSD) ?

candybrie@lemmy.world · 1 year ago

You’re right. Notepad++ still lists macs as using CR for their EOL conversion tool, so I didn’t realize.

Alien Nathan Edward@lemm.ee · 1 year ago

The only reasonable response to this behavior is disproportionate violence

elscallr@lemmy.world · 1 year ago

You can just grep for carriage returns at the end of a line: grep -Prl '\r$' /path/to/whatever. It’ll identify all your problematic files.
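A sketch of the same idea without relying on -P, which not every grep supports. The function name list_crlf_files and the demo file names are made up; -r is widely available but not strictly POSIX.

```sh
# List files under a directory that contain at least one line ending in CR.
list_crlf_files() {
  cr=$(printf '\r')            # a literal carriage return
  grep -rl "${cr}\$" "$1"      # -l prints only the names of matching files
}

demo=$(mktemp -d)
printf 'unix line\n'  > "$demo/unix file.txt"
printf 'dos line\r\n' > "$demo/dos file.txt"
list_crlf_files "$demo"
```

Binary files that happen to contain a CR before a newline byte would be listed too, so the result is a candidate list rather than proof.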

Em Adespoton@lemmy.ca · 1 year ago

“\ “ and [tab] and * are your friends. I’ve been using spaces in Unix filesystems since the early 90s with no issues. Also, using terminal fonts that•put•a•faint•dot•in•each•space•character helps.

ShaunaTheDead@kbin.social · 1 year ago

Yeah, either put quotes around it ‘/like this/you can incorporate/spaces/into your paths’ or /just\ escape/your\ spaces/like\ this
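To see what the shell actually does with each form, here is a small sketch that just counts the arguments it receives (count_args is a throwaway helper, not a standard command):

```sh
# Print how many separate arguments the shell passed in.
count_args() { echo "$#"; }

p='/like this/you can'

count_args $p                    # unquoted: split on the spaces
count_args "$p"                  # quoted: stays one argument
count_args /like\ this/you\ can  # escaped: also one argument
```

Run under sh or bash, the first call reports three arguments and the other two report one.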

silasmariner@programming.dev · 1 year ago

This is fine for the most basic of use cases but once you start looping through file names or what have you, you have to start writing robust correct bash and nobody does that
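For what it's worth, a minimal sketch of a loop that survives spaces, assuming plain POSIX sh; the demo file names are made up.

```sh
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/two  spaces.txt"

# Glob with quoting: each match stays one word, whatever it contains.
count=0
for f in "$dir"/*.txt; do
  [ -e "$f" ] || continue    # skip the literal pattern if nothing matched
  count=$((count + 1))
  printf 'found: %s\n' "$f"
done

# For recursive walks, let find hand the names over as arguments
# instead of parsing its printed output.
find "$dir" -type f -name '*.txt' -exec sh -c '
  for f in "$@"; do printf "found: %s\n" "$f"; done
' sh {} +
```

The two rules doing the work are: never parse a printed list of names, and always write "$f" with the quotes.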

gears@sh.itjust.works · edit-2 1 year ago

It gets real crazy when you’re sending remote commands so you have to escape the escapes so that the remote keeps them and properly escapes the space

ssh -t remote "mv /home/me/folder\\ with\\ spaces /home/me/downloads/"
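One way to avoid counting backslashes is to single-quote the path for the remote shell before building the command string. A sketch, assuming the remote login shell is sh-compatible; shell_quote is a hypothetical helper, and the remote side is simulated here with a local sh -c since ssh hands the command string to a shell in much the same way.

```sh
# Wrap a string in single quotes so a second shell reads it as one word.
# Embedded single quotes are rewritten as '\'' .
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

src='/home/me/folder with spaces'
remote_cmd="mv $(shell_quote "$src") /home/me/downloads/"
printf '%s\n' "$remote_cmd"

# What would actually be run (not executed in this sketch):
#   ssh -t remote "$remote_cmd"

# Local stand-in for the remote shell: the path arrives as one argument.
sh -c "set -- $(shell_quote "$src"); echo \"\$#\""
```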

LocustOfControl@reddthat.com · 1 year ago

Yup, this is me with scp. Well, it would be if I didn’t just use asterisks to avoid that PITA.

PoolloverNathan@programming.dev · edit-2 1 year ago

Does SSH require quoting commands?

gears@sh.itjust.works · 1 year ago

It doesn’t for commands without spaces (e.g. reboot). You might be able to escape the spaces and not use quotes, I’m not sure.

PoolloverNathan@programming.dev · 1 year ago

Might be client-dependent; I’ve regularly run commands with spaces (e.g. ssh a@a.local ssh b@b.local) without a problem.

cobra89@beehaw.org · 1 year ago

Yeah but at least with periods in the title tab complete will just complete the file name all the way while with a filename with spaces I have to escape the damn space with "\ " like you said. Why do more work when I don’t have to?

Euphoma@lemmy.ml · 1 year ago

My shell seems to autocomplete filenames that have spaces with "\ " already.

Amends1782@lemmy.ca · 1 year ago

Yeah I was gonna say this is something anyone in tech knows, spaces are a plague

the_third@feddit.de · 1 year ago

This is scene, there are standards goddammit!

Yes, there really are.

https://en.m.wikipedia.org/wiki/Standard_(warez)

𝕽𝖔𝖔𝖙𝖎𝖊𝖘𝖙@lemmy.world · 1 year ago

Standards and CONSEQUENCES

https://en.wikipedia.org/wiki/Nuke_(warez)

jimmydoreisalefty@lemmus.org · 1 year ago

I prefer dots over spaces.

Spaces can mess with stuff, double space…

KrummsHairyBalls@lemmy.ca · 1 year ago

I.too.prefer.dots.over.spaces.

pete_the_cat@lemmy.world · 1 year ago

:%s/./ /g

people@kbin.social · 1 year ago

And get the bonus of excellent compression after that, too!

LazaroFilm@lemmy.world · 1 year ago

I prefer  

SchizoDenji@lemm.ee · 1 year ago

Dots sometimes pose problems in arrs.

Mr_Blott@lemmy.world · 1 year ago

Yeah I had dots on my arse once. Turned out I’d been sitting on my keyboard

GhostsAreShitty@lemmy.world · 1 year ago

Either are fine, I just wish there was a more consistent standard like naming ROMs. I want to be able to script renaming everything for Kodi

pete_the_cat@lemmy.world · 1 year ago

Look up SMDB (smoke monster’s database). You can download a tool (I forget what it’s actually called, I think one is called ROM manager) which reads the SMDB files and compares the hashes to your ROMs and will categorize and rename them for you. It looks for duplicates, unofficial releases/hacks/patches, categorizes them by country (US, EU and Japan largely), and more. It’s a pretty nifty tool.

I spent like two hours going through PS1 ROMs and was like “there’s got to be a better way!” (insert cheesy black and white infomercial cutaway), started looking up stuff and there it was. Not all game systems are supported (mostly NES, SNES, Genesis/MegaDrive, and a few others) but you can build SMDB “packs” yourself.

I forget if it works on Windows, but I know it works on Linux and it’s either a script or a compiled binary, I forget which, but you can definitely script it, I’ve done so myself since the command string tends to be a bit long.

Laser@feddit.de · 1 year ago

I think your workflow is not optimal. Are you using software like Radarr and Sonarr? They do the renaming for you and come with Kodi integration. Or is this not feasible?

pete_the_cat@lemmy.world · 1 year ago

I think OP means ROM files for video games systems. Kodi has a RetroArch plugin. As I’m sure you’re aware, Sonarr and Radarr only do TV shows and movies, respectively. Managing ROM packs is a pain in the ass because there are usually thousands of files in a pack (I think there’s something stupid like 9,000 ROMs for NES or SNES).

pete_the_cat@lemmy.world · 1 year ago

There is a database that I found called Smoke Monster’s Database. It’s actually a bunch of “databases” (files, not actually databases) that you load into a program; you point it at a directory and it categorizes, organizes, and renames everything for you.

A lot of ROM packs that are out there are pretty old, considering the systems that they’re for are decades old, and have been passed around and added to for years. The packs are usually in a flat file structure and there are usually multiple files for the same game (version updates from the manufacturer) so it gets annoying pretty quickly. Do you want to have to scroll through 9000 NES games just to get to Zelda: A Link to the Past?

GhostsAreShitty@lemmy.world · 1 year ago

Oh it’s totally inefficient. It’s not the most feasible with my current setup, so I’m making do with what I have at the moment.

CmdrShepard@lemmy.one · 1 year ago

In my experience, files are named pretty well these days to include resolution, source, the actual title and release year, video format, audio format, language, and release group.

Try looking at the way music files are named and you’ll see how awful naming conventions can get.

Astaroth@lemm.ee · 1 year ago

why not use underscores?

rustyricotta@lemmy.ml · 1 year ago

I’ve always liked underscores better because it differentiates from the file extension. It just makes sense. Except it is a wider character, so it’d be longer.

Dozzi92@lemmy.world · 1 year ago

Gotta hit shift though. Period, ezpz.

JustEnoughDucks@feddit.nl · 1 year ago

chuckles nervously in azerty hell

TimewornTraveler@lemm.ee · 1 year ago

wow azerty needs shift to enter a period?

Astaroth@lemm.ee · 1 year ago

well if you’re using a mono font (terminal) then there is no such thing as wider characters anyway, so for me that’s not a drawback either

Octopus@thelemmy.club · 1 year ago

They are ugly. Just use -

p3e7@lemm.ee · edit-2 1 year ago

kebab-case-for-the-win

Dozzi92@lemmy.world · 1 year ago

What happens when a hyphen is used in a movie title? I think that’s frequent enough, versus an underscore or a period.

Rogue@feddit.uk · 1 year ago

Semi colons wouldn’t be valid in file names so they’re ignored so there’s no reason to include hyphens either

BrianTheeBiscuiteer@lemmy.world · 1 year ago

Underscores require you to use the Shift key.

Astaroth@lemm.ee · 1 year ago

Do you type with one hand?

And well even if you did you could hold down right shift instead of left shift to only use a single hand.

BrianTheeBiscuiteer@lemmy.world · 1 year ago

When you have RSI you want to minimize every single key press.

01011@monero.town · 1 year ago

Using spaces is so inconsiderate.

retiolus@lemmy.cat · 1 year ago

It’s quite strange, I’ve been downloading torrents for more years than I can count, and I upload them from time to time, and I’ve always had the worry myself of how to name torrents: with dots? underscores? dashes? (although with spaces is definitely not an option).

I’ve even asked the question on several forums and upload sites, read tutorials on these same sites etc., and every time I’ve asked the answer has been: THERE IS NO STANDARD. Even in the tutorials, I’ve never seen such a thing mentioned.

All this to say that I’m making a meme, and after so many years, this is the first time I’ve heard of a Warez scene, and several times in the same comments!, curious, isn’t it? I wish I’d heard about it before.

Socsa@sh.itjust.works · 1 year ago

You should know that in most filesystems that are not NTFS, spaces in file names are not well supported.

Pyrozo007@lemmy.dbzer0.com · 1 year ago

Can you give examples? Linux and Mac have no real issues as far as I’m aware. Nor exFAT or FAT32

bam13302@ttrpg.network · edit-2 1 year ago

The problem is really that space is an argument separator, so to safely handle filenames with spaces you need to handle them specially, either by escaping them or by quoting the entire thing. This means that a filename with spaces can’t just be copy-pasted wherever you want; you have to handle it specially. It adds complications that are resolved by just using a separator that isn’t used for other things, like underscore or dash. Dot I also don’t like as much, as it’s used as a separator for extensions, but that’s a far easier problem to handle by just ignoring all but the last dot, leaving only one really bad edge case (a file that does not have an extension but uses dots as separators in its filename, so that a wrong extension gets implied).
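The "ignore all but the last dot" rule and its edge case can be sketched with plain parameter expansion. The function name split_name and the sample file names are made up for the example.

```sh
# Split a name at its last dot; names without any dot get an empty extension.
split_name() {
  case $1 in
    *.*) printf 'stem=%s ext=%s\n' "${1%.*}" "${1##*.}" ;;
    *)   printf 'stem=%s ext=\n' "$1" ;;
  esac
}

split_name 'Some.Movie.2016.1080p.mkv'   # ext=mkv, as intended
split_name 'Some.Movie.2016.1080p'       # the bad edge case: ext=1080p
split_name 'README'                      # no dot at all: empty extension
```

The second call is the case described above: with no real extension, the last dot-separated word gets mistaken for one.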

ramjambamalam@lemmy.ca · 1 year ago

That’s a problem with the shell though, not the filesystem. It doesn’t matter which filesystem you’re using; most interactive shells use spaces as token separators and therefore spaces in filenames need to be enclosed in quotes or escaped.

gayhitler420@lemm.ee · 1 year ago

I’m with the person you’re replying to, what’s an example? I haven’t had a problem working with filenames with spaces in at least ten years on windows, Linux or Mac…

retiolus@lemmy.cat · 1 year ago

Have you ever written a program or simply used a terminal?

gayhitler420@lemm.ee · 1 year ago

Escape characters and autocomplete exist.

It’s also really good practice to account for weird characters in programs and shell scripts you write because then you don’t have injection vulnerabilities or unicode problems.

Seriously, what’s an example of spaces in filenames causing a problem?

bam13302@ttrpg.network · edit-2 1 year ago

for f in *.txt; do cat $f; done

Will error for example. It works fine for filenames without space, but if the filename has space in it, it will be interpreted wrong. But if your testing batch doesn’t have spaces in the filename, you won’t see the issue until it’s used on a file that does. Note ‘cat’ is a placeholder, any function/script that can be used on a file here will have the same issue.
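A small sketch of the failure and the usual fix, using two throwaway files in a temp directory (the file names are made up; cat is still just a placeholder):

```sh
dir=$(mktemp -d)
printf 'one\n' > "$dir/plain.txt"
printf 'two\n' > "$dir/with space.txt"

# Unquoted: the second name is split at the space into two words,
# so cat is asked for two files that do not exist (errors hidden here).
for f in "$dir"/*.txt; do cat $f; done 2>/dev/null

# Quoted: each name reaches cat as a single argument.
for f in "$dir"/*.txt; do cat "$f"; done
```

The only difference is the pair of double quotes around $f, which is exactly why a test batch without spaces never shows the problem.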

Something similar to that caught me last week while I was unzipping multiple mods in bulk for a game.

eluvatar@programming.dev · 1 year ago

Clearly the best option then is to just use some of each. Like this: “MovieTitle-2000.Your_mom h.265”

WarmApplePieShrek@lemmy.dbzer0.com · edit-2 1 year ago

Scene has standards. You don’t have to be scene to use scene naming standards. https://scenerules.org

JokeDeity@lemm.ee · edit-2 1 year ago

As soon as the file finishes downloading it becomes only the name of the movie.filetype

I can’t stand the titles on torrents.

bloup@lemmy.sdf.org · edit-2 1 year ago

“Titles”? It’s not a title, it’s a file name that contains a lot of details about the rip. In the post’s example it tells you that it’s the movie Split, ripped from blu ray, in 1080p, with audio tracks in Italian and English, and encoded in x265. You probably would hate a lot more not being able to tell the difference between split.mp4 recorded on my cellphone in the movie theater and split.mp4 in ultra hd 4k ripped straight from Netflix.

JokeDeity@lemm.ee · 1 year ago

Lol, okay. Calm down buddy. What I do doesn’t affect you. The torrent description lets me know all that too, I just hate having those file names in my library, looks messy and it’s less easy for my eyes to browse quickly.

bloup@lemmy.sdf.org · 1 year ago

I mean I never told you not to rename them lmfao. You just said “I can’t stand the titles on torrents” like people just made these really long filenames for shits and giggles. Also lots of torrent sites will feature several different kinds of rips. It’s not very convenient on the back end to have all rips of the same movie have the same file name.

Also “calm down”? Idk I thought I gave a pretty chill explanation of why things are the way they are but sorry if it didn’t come across that way.

BitsOfBeard@programming.dev · 1 year ago

These days, it feels like one needs a disclaimer for every opinion or fact just to avoid setting someone off. I feel like it discourages open conversation…

Herbal Gamer@sh.itjust.works · 1 year ago

fuck your open conversation /s

retiolus@lemmy.cat · 1 year ago

This.is.a.meme.

metaStatic@kbin.social · 1 year ago

This.is.a.meme.exe

ShortFuse@lemmy.world · 1 year ago

Should be a hyphen instead of period before NAHOM.

balderdash@lemmy.zip · 1 year ago

Why are spaces bad? Does it mess with Sonarr/Radarr or something?

0x4E4F@infosec.pub · edit-2 1 year ago

It’s legacy: white spaces weren’t allowed as characters in most FTP software, which is how the warez scene shares its releases. It used to be underscores, but dots are closer to a white space regarding separation (space-wise), so most release groups use dots nowadays.

Generally, a white space as a character in filenames and directories is “frowned upon” in many operating systems, Windows included (somewhat). It makes writing scripts and software more complicated because it’s used as a separator for giving command line/terminal options to commands and binaries (programs).

originalucifer@moist.catsweat.com · 1 year ago

it goes way back before ftp… i believe it’s because the original operating systems’ filesystems/namespacing could not handle the space character at all. so all files lacked spaces in their names. but only for like the first 30 years

0x4E4F@infosec.pub · 1 year ago

Yes, you’re correct, it goes much further back than FTP, all the way down to UNIX I believe. The problem was commands and parameters (options), which use a white space to separate them. So, filenames and directories weren’t allowed to have white spaces in them.

biscoot@lemmy.getmeotter.work · 1 year ago

30 years ain’t small

retiolus@lemmy.cat · 1 year ago

Spaces are a headache whenever you’re not using a graphical interface.

pete_the_cat@lemmy.world · 1 year ago

Quote\escape all the things!

the_third@feddit.de · 1 year ago

Yes, but, no.

n3m37h@lemmy.dbzer0.com · 1 year ago

Name [Year] (1080p265)
This is how I sort my movies at least

The£0b°t°m¡§t@lemmy.dbzer0.com · 1 year ago

Mine is: [Year] Name [Languages][Resolution]

ElectricCattleman@lemmy.world · 1 year ago

[Year] Name

For me. I don’t care about resolution after I’ve downloaded it. Heck, I don’t need to know the resolution before downloading, I can tell by the file size.

SchizoDenji@lemm.ee · 1 year ago

I usually add tvdb id for Series so that fixing identification problems in jellyfin is easier.

umbrella@lemmy.ml · 1 year ago

i just let radarr do it for me nowadays

lukini@beehaw.org · 1 year ago

It’s supposed to be a dash before the group name.

TheInsane42@lemmy.world · 1 year ago

When searching, dots, when downloading, who cares?

When searching, dots act as and, spaces as or (at least in qtorrent). The dots make searching easier.

otp@sh.itjust.works · 1 year ago

What do all the commas do?