Notice the "unrealistic" in the title... I did not bother to check whether it was realistic or not 😀 but just wrote my wish-list.
Here is my list:
- Unicode and UTF-8 by default
- Indexing strings
- Native Object Oriented system
- Signatures
- A config-free CPAN client in the core
- Images in POD
- Smart Match
- A much better REPL
- And more...
Unicode and UTF-8 by default
Today when using UTF-8 in Perl, you often have to explicitly declare it. For example like the following:
use open ':std', ':encoding(UTF-8)';
print "\x{2717}\n";
(otherwise you will see: Wide character in print at test.pl line 6.)
On the other hand, if you give the glyph instead of the code point, it works fine, like the following:
$ perl -e 'print "André\n"'
André
(My editor and terminal are UTF-8)
Stranger still, if you add use utf8, things start to go wrong:
$ perl -e 'use utf8; print "André\n"'
Andr�
But adding use open gives you back the correct output:
$ perl -e 'use utf8; use open ":std", ":encoding(UTF-8)"; print "André\n"'
André
The topic of Unicode and UTF-8 is overly complicated and I'm not an expert, but I can say that it's not transparent in Perl 5, and I can point you to this blog post 😀
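A minimal sketch of what seems to be going on in the examples above: use utf8 only tells perl that the source code is UTF-8, so "André" becomes a string of 5 characters, but it says nothing about how those characters are written to the terminal. Without an output layer, the é is written as a single Latin-1 byte, which a UTF-8 terminal cannot display. The variable names below are only illustrative.

```perl
use strict;
use warnings;
use utf8;                          # the source code itself is UTF-8
use Encode qw(encode);

my $name  = "André";               # decoded text: 5 characters
my $bytes = encode('UTF-8', $name); # encoded text: é becomes 2 bytes

printf "%d characters, %d bytes\n", length($name), length($bytes);
# prints: 5 characters, 6 bytes

# Tell perl how to encode characters on output, then print the text.
binmode(STDOUT, ':encoding(UTF-8)');
print "$name\n";
```

The binmode call here does the same job for STDOUT that use open ':std', ':encoding(UTF-8)' does for the standard handles.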
Python 2 also had problems with UTF-8 and required you to add a magic comment like this:
# -*- coding: utf-8 -*-
print "André"
Or using "unicode strings" if you wanted to give the code point:
# -*- coding: utf-8 -*-
print u"\u2717";
Python 2 was especially annoying since a non-ASCII character in a source file without the magic comment would lead to the error Non-ASCII character '\xc3', even in a comment (!):
# s = "André"
# BOOM!
But it was improved a lot in Python 3.
Strings are internally represented as Unicode and the default encoding of source files is UTF-8:
$ python3 -c 'print("\u2717")'
✗
Or
$ python3 -c 'print("André")'
André
The magic encoding comment then becomes generally unnecessary, unless the file encoding is not UTF-8:
# -*- coding: iso-8859-15 -*-
s = "ISO André" # This is not UTF-8
print(s)
If you forget the magic encoding comment, you will get this terrible error:
SyntaxError: Non-UTF-8 code starting with '\xe9' in file test6.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
Indexing strings
While we can access items of an array (declared @array) with an index $array[1] or a slice @array[1,2], we can't apply the same kind of indexing to simple scalars holding strings.
What I would like to do is:
my $str = "bazinga";
print $str[1];
Or slicing (substring):
my $str = "bazinga";
print $str[1,2];
Both are achievable with substr:
my $str = "bazinga";
print substr($str, 1, 1);
Or by first converting the string into an array with split:
my $str = "bazinga";
my @array = split("", $str);
print @array[1,2];
But that is extra gymnastics that I would like to avoid if possible 😃
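In the meantime, the gymnastics can at least be hidden behind small helpers. This is only a sketch: char_at and chars_at are hypothetical names, not part of Perl, and they simply wrap substr.

```perl
use strict;
use warnings;

# Hypothetical helper: emulates the wished-for $str[1].
sub char_at {
    my ($str, $index) = @_;
    return substr($str, $index, 1);
}

# Hypothetical helper: emulates the wished-for $str[1,2].
sub chars_at {
    my ($str, @indexes) = @_;
    return join '', map { substr($str, $_, 1) } @indexes;
}

my $str = "bazinga";
print char_at($str, 1), "\n";      # a
print chars_at($str, 1, 2), "\n";  # az
```

Because substr accepts negative offsets, char_at($str, -1) returns the last character, much like $array[-1] does for arrays.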
I could even imagine mixing indexing on scalar and array slices:
my @array = ("foo", "bar", "baz");
print $array[1]; # "bar"
print $array[1][1]; # "a"
This feature may be impossible to implement (are there cases where the syntax would conflict? Short answer: YES, see below), but did I mention that this wish-list is unrealistic?
EDIT: As pointed out by "quote-only-eeee" on reddit, it definitely conflicts, since both $a and @a can live together in a Perl program, and $a[0] could not tell which one it applies to.
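The conflict is easy to reproduce. In this short sketch the names $x and @x are only illustrative (chosen instead of $a to stay clear of the special sort variables): a scalar and an array share a name, and the bracket syntax already belongs to the array.

```perl
use strict;
use warnings;

my $x = "bazinga";               # a scalar named x
my @x = ("foo", "bar", "baz");   # an array that is also named x

print $x, "\n";                  # the scalar: bazinga
print $x[0], "\n";               # element 0 of @x: foo (not "b")
print substr($x, 0, 1), "\n";    # first character of the scalar: b
```

So $x[0] is already taken by the array, and a string index using the same syntax would be ambiguous.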
Native Object Oriented system
For those who don't know, there is the Cor(inna) initiative and I'm eagerly waiting for it.
Go for the native Object Oriented capabilities of Perl!
Signatures
We have discussed this a lot already (here, and @mjgardner wrote great posts about it here and here).
It is the first thing to come, and I'm very happy to have it.
A config-free CPAN client in the core
We have a CPAN client in the core but it is not dead simple.
$ cpan Acme::LSD
CPAN.pm requires configuration, but most of it can be done automatically.
If you answer 'no' below, you will enter an interactive dialog for each
configuration option instead.
Would you like to configure as much as possible automatically? [yes] ^C
Maybe we could integrate cpanm or cpm in the core?
In case you wonder, Python's pip is config-free and has been bundled with the core distribution since 3.4.
Images in POD
It's really missing.
You can emulate it with HTML in POD, but it's not very handy.
Having it directly in the POD format would be great and there was an early attempt to add it (see this issue).
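For readers who have never seen it, the HTML emulation mentioned above looks roughly like the sketch below. The module name and the image URL are hypothetical placeholders. Only formatters that produce HTML render the =begin html block. Others, such as pod2text or pod2man, skip it, which is a big part of why it's not very handy.

```perl
=pod

=head1 NAME

My::Module - hypothetical module used only for this example

=head1 DESCRIPTION

The diagram below only shows up in HTML renderings of this page.

=begin html

<p><img src="https://example.com/diagram.png" alt="Architecture diagram"></p>

=end html

=cut
```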
Smart match
This is a subject with a long history. I know from my reading that it is a "hard as hell" feature to implement (in particular, it has DWIM issues), but I would like it, one day, maybe, if possible, pleeaaaaase...
A much better REPL
The Python REPL is far, far ahead of anything you can have in Perl (but on the other hand, Perl's one-liner capabilities are far ahead).
My typical workflow for trying ideas in Perl is to write a file then execute it (or alternatively perl -e) while my typical workflow for trying ideas in Python is to type python3 and go with the REPL.
I have seen a talk from RJBS that made extensive and impressive use of the Perl debugger, but I'm not doing the same 😀
When a pure pythonista at work made his first contribution to a Perl script, his first remark was "Where is the REPL, I typed perl then my commands but nothing happens".
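To give an idea of the gap, here is a deliberately naive read-eval-print loop in a few lines of Perl. It is only a sketch, nowhere near the Python REPL, and eval_line is a hypothetical helper name. The usual stopgap is to start the debugger with perl -de0 and type statements there.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate one line of Perl and return printable text.
sub eval_line {
    my ($line) = @_;
    my @result = eval $line;       # string eval, in list context
    return "Error: $@" if $@;
    return join(', ', map { defined $_ ? $_ : 'undef' } @result);
}

# Run the loop only when attached to a terminal.
if (-t STDIN) {
    $| = 1;                        # flush the prompt immediately
    print "perl> ";
    while (defined(my $line = <STDIN>)) {
        chomp $line;
        print eval_line($line), "\n" if length $line;
        print "perl> ";
    }
    print "\n";
}
```

One limitation shows right away: a variable declared with my on one line is gone on the next, because each string eval is its own scope. A real REPL has to work around that, among many other things.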
And more...
I could have added better threads and nativecall (see nativecall and Perl port) to the list but it was already too unrealistic 😃
I also like the ideas (isa, sharpy equalities, multi sub...) presented by Paul Evans in his FOSDEM talk, which I reviewed here.
Top comments (9)
3,4,5,8 are all fine by me. 7 is a mess; don't go there. As for 6, I want to see MARKDOWN as a supported alternate documentation language. POD was great in its time, but the world has moved on. You'd turn it on with a new keyword like
=input [pod/md]
where 'pod' is the default.
I would like to see some markup/down/pod that has semantic tags:
Instead of =head2 or ##, I'd like to see something that can be used to mark: function, method, variable, attribute, class etc.
Actually, I'd use =from instead of =input because it better aligns as the inverse of =for.
I must be the only person on this planet that likes the object system the way it is now.
As the person at my job that handles language support, I 100% welcome the utf8 standard. Also I hate pod and agree it should be replaced.
Adding image support to POD is not controversial. There was a proposal for this, and there was conflict surrounding that proposal, but the conflict had nothing to do with supporting images in POD, rather it was about that particular proposal’s introduction of inline YAML to POD. I don’t even think it’s fair to characterise what happened as controversy, since feedback from early on and from many sides universally discouraged the use of YAML. The contributor just forged ahead with it anyway, and when this led to a lack of interest in the contribution, complained loudly and quit.
OK thank you for the clarifications, I edited my post 😃
Oh! I didn’t mean this as criticism at all – sorry I wasn’t clearer about that. I just wanted to give hope that that particular wishlist item of yours is not at all unrealistic.
(It will need a well-thought-out proposal, yes (and our current era of high-DPI displays unfortunately makes that somewhat complicated). And it’s not something a lot of people want desperately. But I do believe a good proposal would be widely (if mildly) appreciated. I sure think it’d be a good idea – if done well.)
There are lots of things broken regarding Unicode and UTF-8 in Perl and I really hope all of them are fixed before it becomes Perl 7.
My other favorite items are:
A nice OO framework built-in.
Proper exception handling with the core throwing exception objects instead of strings!
Oh, well, and definitively signatures!!!
A sane threading model (even if it is just cooperative).
Not unrealistic, just don't turn Perl into Python.
And steer clear of threads. Here be dragons.