Tib

Posted on Apr 21, 2021 • Edited on Apr 26, 2021

My unrealistic wish-list for Perl 7.x

#perl #discuss

Notice the "unrealistic" in the title... I did not bother to check whether any of this is realistic or not 😀 but just wrote my wish-list.

Here is my list:

Unicode and UTF-8 by default
Indexing strings
Native Object Oriented system
Signatures
A config-free CPAN client in the core
Images in POD
Smart Match
A much better REPL
And more...

Unicode and UTF-8 by default

Today, when using UTF-8 in Perl, you often have to declare it explicitly, for example:

use open ':std', ':encoding(UTF-8)';
print "\x{2717}\n";

(or you will see a Wide character in print at test.pl line 6.)

On the other hand, if you give the glyph instead of the code point, it works fine (without use utf8, Perl treats the source as plain bytes and passes them through untouched), like the following:

$ perl -e 'print "André\n"'
André

(My editor and terminal are UTF-8)

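But, more strangely, if you add use utf8 it starts to do bad things (the literal is now decoded to characters, and without an output encoding layer the é is written as a single Latin-1 byte):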

$ perl -e 'use utf8; print "André\n"'
Andr�

But adding the use open will give you back the correct output:

$ perl -e 'use utf8; use open ":std", ":encoding(UTF-8)"; print "André\n"'
André

The topic of Unicode and UTF-8 is overcomplicated and I'm not an expert, but I can say that it's not transparent in Perl 5, and I can point you to this blog post 😀
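
To make the characters-versus-bytes distinction behind these examples concrete, here is a minimal sketch using the core Encode module. The variable names are mine, and the é is written as an escape so that the example does not depend on how the file itself is saved.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "Andr\x{e9}" is "André" written with an escape, so this file stays pure ASCII.
my $chars = "Andr\x{e9}";

# Encoding turns the 5 characters into the 6 bytes a UTF-8 terminal expects.
my $bytes = encode('UTF-8', $chars);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);

# Decoding the bytes gives back the original character string.
my $round_trip = decode('UTF-8', $bytes);
print "round trip ok\n" if $round_trip eq $chars;
```

This is the conversion that use open ':std', ':encoding(UTF-8)' performs for you on the standard filehandles.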

Python 2 also had problems with UTF-8 and required a magic comment like this:

# -*- coding: utf-8 -*-

print "André"

Or using "unicode strings" if you wanted to give the code point:

# -*- coding: utf-8 -*-

print u"\u2717"

Python 2 was especially annoying since a non-ASCII character in the source file would lead to the error Non-ASCII character '\xc3', even in a comment (!):

# s = "André"
# BOOM!

But it was improved a lot in Python 3.

Strings are represented internally as Unicode and the default encoding of source files is UTF-8:

$ python3 -c 'print("\u2717")'
✗

$ python3 -c 'print("André")'
André

The magic encoding comment then becomes generally unnecessary, unless the file's encoding is not UTF-8:

# -*- coding: iso-8859-15 -*-
s = "ISO André" # This is not UTF-8
print(s)

If you forget the magic encoding comment, you will get this terrible error:

SyntaxError: Non-UTF-8 code starting with '\xe9' in file test6.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Indexing strings

While we can access items of an array (declared @array) with an index $array[1] or a slice @array[1,2], we can't apply the same kind of indexing to simple scalars holding strings.

What I would like to do is:

my $str = "bazinga";
print $str[1];

Or slicing (substring):

my $str = "bazinga";
print $str[1,2];

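Both are achievable with substr:

my $str = "bazinga";
print substr($str, 1, 1); # "a"  (one character at index 1)
print substr($str, 1, 2); # "az" (two characters starting at index 1)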

Or by first converting the string into an array with split:

my $str = "bazinga";
my @array = split //, $str; # one element per character
print @array[1,2];          # "az"

But these are extra gymnastics that I would like to avoid if possible 😃
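
In the meantime, the gymnastics can at least be hidden behind two small helpers. This is only a sketch: char_at and str_slice are hypothetical names of mine, not functions that exist in the core.

```perl
use strict;
use warnings;

# Hypothetical helper: the character at a given index (negative counts from the end).
sub char_at {
    my ($str, $index) = @_;
    return substr($str, $index, 1);
}

# Hypothetical helper: the characters at several indexes, joined into one string.
sub str_slice {
    my ($str, @indexes) = @_;
    return join '', map { substr($str, $_, 1) } @indexes;
}

my $str = "bazinga";
print char_at($str, 1), "\n";      # a
print str_slice($str, 1, 2), "\n"; # az
```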

I could even imagine mixing indexing on scalar and array slices:

my @array = ("foo", "bar", "baz");
print $array[1];    # "bar"
print $array[1][1]; # "a" (wished-for syntax, not what Perl 5 does)

This feature may be impossible to implement (are there cases where the syntax would conflict? Short answer: YES, see below), but did I mention that this wish-list was unrealistic?

EDIT: As pointed out by "quote-only-eeee" on reddit, it definitely conflicts, since both $a and @a can live together in a Perl program and $a[0] could not tell which one it applies to.
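
The conflict is easy to reproduce with today's Perl. In this small sketch (variable names chosen for illustration), a scalar and an array share the same name, and the bracket syntax already belongs to the array:

```perl
use strict;
use warnings;

my $str = "bazinga";        # a scalar...
my @str = ("foo", "bar");   # ...and an unrelated array with the same name

# Perl already defines $str[0] as "element 0 of @str",
# so it cannot also mean "character 0 of $str".
print $str[0], "\n";            # foo
print substr($str, 0, 1), "\n"; # b
```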

Native Object Oriented system

For those who don't know, there is the Cor(inna) initiative, and I'm eagerly waiting for it.

Go for the native Object Oriented capabilities of Perl!
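
For comparison, here is roughly what a small class looks like today with nothing but the core bless mechanism. This is a sketch of the current boilerplate, not of Cor syntax, and the Point class is a made-up example.

```perl
package Point;   # hypothetical example class
use strict;
use warnings;

# Constructor: a hash reference blessed into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { x => $args{x} // 0, y => $args{y} // 0 };
    return bless $self, $class;
}

# Hand-written accessors, one per attribute.
sub x { my ($self) = @_; return $self->{x} }
sub y { my ($self) = @_; return $self->{y} }

sub to_string {
    my ($self) = @_;
    return "($self->{x}, $self->{y})";
}

package main;

my $point = Point->new(x => 1, y => 2);
print $point->to_string, "\n";   # (1, 2)
```

The constructor and the accessors are exactly the kind of repetitive code a native object system could generate for us.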

Signatures

We have discussed this a lot already (here, and @mjgardner wrote great posts about it here and here).

It is the first thing to come, and I'm very happy to have it.
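
As a reminder of what this looks like, here is a minimal sketch using the signatures feature as it exists in Perl 5. The greet function is just an illustration; the no warnings line silences the "experimental" warning that Perl versions before 5.36 emit.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# One mandatory parameter and one optional parameter with a default value.
sub greet ($name, $greeting = 'Hello') {
    return "$greeting, $name!";
}

print greet('Perl'), "\n";            # Hello, Perl!
print greet('Perl', 'Bonjour'), "\n"; # Bonjour, Perl!
```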

A config-free CPAN client in the core

We have a CPAN client in the core, but it is not dead simple.

$ cpan Acme::LSD

CPAN.pm requires configuration, but most of it can be done automatically.
If you answer 'no' below, you will enter an interactive dialog for each
configuration option instead.

Would you like to configure as much as possible automatically? [yes] ^C

Maybe we could integrate cpanm or cpm in the core?

In case you wonder, Python's pip is config-free and ships with Python (since 3.4).

Images in POD

It's really missing.

You can emulate it with HTML in POD:
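
A minimal sketch of that workaround, assuming a made-up module name and image URL: the =begin html / =end html region is only picked up by HTML formatters, and other formatters (such as perldoc in a terminal) skip it.

```perl
=pod

=head1 NAME

My::Module - hypothetical module, used only for this illustration

=head1 DESCRIPTION

=begin html

<p><img src="https://example.com/diagram.png" alt="Architecture diagram"></p>

=end html

=cut
```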

But it's not very handy.

Having it directly in the POD format would be great and there was an early attempt to add it (see this issue).

Smart match

This one has a long history. I know from my reading that it is a "hard as hell" feature to implement (in particular, it has DWIM issues), but I would like it, one day, maybe, if possible, pleeaaaaase...
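
Until then, the most common use of smart match (is this value in that list?) can be spelled out explicitly. A minimal sketch with any from the core List::Util module; in_list is a hypothetical helper name of mine.

```perl
use strict;
use warnings;
use List::Util qw(any);

# Hypothetical helper: true if $needle is string-equal to one of the items.
# Choosing "eq" here is the decision that smart match tries to guess for you.
sub in_list {
    my ($needle, @haystack) = @_;
    return any { $_ eq $needle } @haystack;
}

my @languages = qw(perl python raku);
print in_list('perl', @languages) ? "found\n"     : "not found\n"; # found
print in_list('ruby', @languages) ? "found\n"     : "not found\n"; # not found
```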

A much better REPL

The Python REPL is far, far ahead of everything you can have in Perl (but on the other hand, Perl's one-liner capabilities are far ahead).

My typical workflow for trying ideas in Perl is to write a file then execute it (or alternatively perl -e) while my typical workflow for trying ideas in Python is to type python3 and go with the REPL.

I have seen a talk by RJBS that made extensive and impressive use of the Perl debugger, but I'm not doing the same 😀

When a pure pythonista at work made his first contribution to a Perl script, his first remark was "Where is the REPL, I typed perl then my commands but nothing happens".
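
The core of a REPL is small: read a line, evaluate it, print the result. Here is a rough sketch of the evaluate-and-print part, built on string eval. eval_line is a hypothetical helper, and the lines come from a fixed list instead of the keyboard so that the example runs unattended; a real REPL would read from STDIN and add history, completion and so on.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate one line of Perl and return a printable result.
sub eval_line {
    my ($line) = @_;
    my @result = eval $line;   # string eval: compile and run the line
    return "error: $@" if $@;
    return join ', ', map { defined($_) ? $_ : 'undef' } @result;
}

# Stand-in for lines typed at a prompt.
for my $line ('1 + 2', 'join "-", 1..3', 'uc "perl"') {
    print "> $line\n";
    print eval_line($line), "\n";
}
```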

And more...

I could have added better threads and nativecall (see nativecall and Perl port) to the list but it was already too unrealistic 😃

I also like ideas (isa, sharpy equalities, multi sub...) presented by Paul Evans in his FOSDEM talk that I reviewed here

Top comments (9)

Matthew O. Persico • Apr 21 '21

3,4,5,8 are all fine by me. 7 is a mess; don't go there. As for 6, I want to see MARKDOWN as a supported alternate documentation language. POD was great in its time, but the world has moved on. You'd turn it on with a new keyword like

=input [pod/md]

where 'pod' is the default.

Gabor Szabo • Apr 21 '21 • Edited

I would like to see some markup/down/pod that has semantic tags:

Instead of =head2 or ## I'd like to see something that can be used to mark:

function, method, variable, attribute, class etc.

Matthew O. Persico • Apr 21 '21

Actually, I'd use =from instead of =input because it better aligns as the inverse of =for.

BigCox • Apr 26 '21

I must be the only person on this planet that likes the object system the way it is now.

As the person at my job that handles language support, I 100% welcome the utf8 standard. Also I hate pod and agree it should be replaced.

Aristotle Pagaltzis • Apr 26 '21 • Edited

Adding image support to POD is not controversial. There was a proposal for this, and there was conflict surrounding that proposal, but the conflict had nothing to do with supporting images in POD, rather it was about that particular proposal’s introduction of inline YAML to POD. I don’t even think it’s fair to characterise what happened as controversy, since feedback from early on and from many sides universally discouraged the use of YAML. The contributor just forged ahead with it anyway, and when this led to a lack of interest in the contribution, complained loudly and quit.

Tib • Apr 26 '21

OK thank you for the clarifications, I edited my post 😃

Aristotle Pagaltzis • Apr 26 '21

Oh! I didn’t mean this as criticism at all – sorry I wasn’t clearer about that. I just wanted to give hope that that particular wishlist item of yours is not at all unrealistic.

(It will need a well-thought-out proposal, yes (and our current era of high-DPI displays unfortunately makes that somewhat complicated). And it’s not something a lot of people want desperately. But I do believe a good proposal would be widely (if mildly) appreciated. I sure think it’d be a good idea – if done well.)

Salvador Fandiño • Apr 24 '21 • Edited

There are lots of things broken regarding Unicode and UTF-8 in Perl and I really hope all of them are fixed before it becomes Perl 7.

My other favorite items are:

A nice OO framework built-in.
Proper exception handling with the core throwing exception objects instead of strings!
Oh, well, and definitely signatures!!!
A sane threading model (even if it is just cooperative).

Dave Hodgkinson • Apr 22 '21

Not unrealistic, just don't turn Perl into Python.

And steer clear of threads. Here be dragons.
